Snowflake
Allows read and write operations on Snowflake.
How to configure Key Pair Authentication on Snowflake?
Please refer to the official Snowflake documentation on attaching a private key to a user account: Configuring Key Pair Authentication.
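Once the public key is attached to the Snowflake user, the Spark connector can authenticate with the matching private key instead of a password. A minimal, non-generated Python sketch of that pattern, assuming an unencrypted PKCS8 key at the hypothetical path /dbfs/keys/rsa_key.p8 and the connector's pem_private_key option (check the Snowflake Spark connector documentation for the exact key format it expects):

from pyspark.sql import SparkSession, DataFrame

def sf_customer_keypair(spark: SparkSession) -> DataFrame:
    # Strip the PEM header/footer lines and newlines, keeping only the base64 body.
    with open("/dbfs/keys/rsa_key.p8") as f:  # hypothetical key location
        key_body = "".join(line.strip() for line in f if "PRIVATE KEY" not in line)
    return (
        spark.read.format("snowflake")
        .options(**{
            "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
            "sfUser": "anshuman",
            "pem_private_key": key_body,  # replaces sfPassword for key pair auth
            "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
            "sfSchema": "TPCDS_SF100TCL",
            "sfWarehouse": "COMPUTE_WH",
        })
        .option("dbtable", "CUSTOMER")
        .load()
    )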
Source
Source Parameters
Parameter | Description | Required |
---|---|---|
Dataset Name | Name of the Dataset | True |
Credential Type | Credential Type: Databricks Secrets or Username & Password | True |
Credentials | Databricks credential name; otherwise, the username and password for the Snowflake account (a Databricks Secrets sketch follows the generated code below) | Required if Credential Type is Databricks Secrets |
Username | Login name for the Snowflake user | Required if Credential Type is Username & Password |
Password | Password for the Snowflake user | Required if Credential Type is Username & Password |
Private key filepath | Location of the private key file in PKCS8 format | Required if Private Key Details is enabled |
Private key passphrase | Passphrase of the private key file | Required if the private key file is passphrase-protected |
Url | Hostname for your account in the format <account_identifier>.snowflakecomputing.com. E.g., https://DJ07623.ap-south-1.aws.snowflakecomputing.com | True |
Database | Database to use for the session after connecting | True |
Schema | Schema to use for the session after connecting | True |
Warehouse | Default virtual warehouse to use for the session after connecting | False |
Role | Default security role to use for the session after connecting | False |
Data Source | Strategy to read data: DB Table or SQL Query | True |
Table | Name of the table to be read. All columns and records are retrieved (i.e., equivalent to SELECT * FROM table). | Required if Data Source is DB Table |
SQL Query | Exact query (SELECT statement) to run; see the sketch below the table | Required if Data Source is SQL Query |
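When Data Source is SQL Query, the reader passes the statement through the connector's query option instead of dbtable, so only the query's result set is fetched. A minimal, non-generated Python sketch reusing the connection options from the example below (the query itself is only illustrative):

def sf_customer_query(spark: SparkSession) -> DataFrame:
    options = {
        "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
        "sfUser": "anshuman",
        "sfPassword": "******",
        "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
        "sfSchema": "TPCDS_SF100TCL",
        "sfWarehouse": "COMPUTE_WH"
    }
    return (
        spark.read.format("snowflake")
        .options(**options)
        # "query" replaces "dbtable"; Snowflake runs the statement and returns its result set
        .option("query", "SELECT C_CUSTOMER_ID, C_LAST_NAME FROM CUSTOMER LIMIT 100")
        .load()
    )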
Example
Generated Code
- Python
- Scala
def sf_customer(spark: SparkSession) -> DataFrame:
    from pyspark.dbutils import DBUtils
    return spark.read\
        .format("snowflake")\
        .options(
            **{
                "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
                "sfUser": "anshuman",
                "sfPassword": "*******",
                "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
                "sfSchema": "TPCDS_SF100TCL",
                "sfWarehouse": "COMPUTE_WH"
            }
        )\
        .option("dbtable", "CUSTOMER")\
        .load()
object customer_snow_src {
  def apply(spark: SparkSession): DataFrame = {
    import com.databricks.dbutils_v1.DBUtilsHolder.dbutils
    var reader = spark.read
      .format("snowflake")
      .options(
        Map(
          "sfUrl" → "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
          "sfUser" → "anshuman",
          "sfPassword" → "******",
          "sfDatabase" → "SNOWFLAKE_SAMPLE_DATA",
          "sfSchema" → "TPCDS_SF100TCL",
          "sfWarehouse" → "COMPUTE_WH"
        )
      )
    reader = reader.option("dbtable", "CUSTOMER")
    reader.load()
  }
}
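When Credential Type is Databricks Secrets, the username and password are resolved from a secret scope at runtime instead of being embedded in the code (which is why the generated code imports DBUtils). A minimal, non-generated Python sketch, assuming a hypothetical secret scope snowflake_creds with keys username and password:

def sf_customer_secrets(spark: SparkSession) -> DataFrame:
    from pyspark.dbutils import DBUtils
    dbutils = DBUtils(spark)
    return (
        spark.read.format("snowflake")
        .options(**{
            "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
            # hypothetical scope and key names; use the ones configured in your workspace
            "sfUser": dbutils.secrets.get(scope="snowflake_creds", key="username"),
            "sfPassword": dbutils.secrets.get(scope="snowflake_creds", key="password"),
            "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
            "sfSchema": "TPCDS_SF100TCL",
            "sfWarehouse": "COMPUTE_WH"
        })
        .option("dbtable", "CUSTOMER")
        .load()
    )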
Target
Target Parameters
Parameter | Description | Required |
---|---|---|
Dataset Name | Name of the Dataset | True |
Credential Type | Credential Type: Databricks Secrets or Username & Password | True |
Credentials | Databricks credential name; otherwise, the username and password for the Snowflake account | Required if Credential Type is Databricks Secrets |
Username | Login name for the Snowflake user | Required if Credential Type is Username & Password |
Password | Password for the Snowflake user | Required if Credential Type is Username & Password |
Private key filepath | Location of the private key file in PKCS8 format | Required if Private Key Details is enabled |
Private key passphrase | Passphrase of the private key file | Required if the private key file is passphrase-protected |
Url | Hostname for your account in the format <account_identifier>.snowflakecomputing.com. E.g., https://DJ07623.ap-south-1.aws.snowflakecomputing.com | True |
Database | Database to use for the session after connecting | True |
Schema | Schema to use for the session after connecting | True |
Warehouse | Default virtual warehouse to use for the session after connecting | False |
Role | Default security role to use for the session after connecting | False |
Table | Name of the table to which data is to be written | True |
Write Mode | How to handle existing data. See the Supported Write Modes table below for the available options. | True |
Post-Script SQL | DDL/DML SQL statements to execute before writing data. Intended for statements that do not return a result set, for example DDL statements like CREATE TABLE and DML statements like INSERT, UPDATE, and DELETE. It is not useful for statements that return a result set, such as SELECT or SHOW (see the sketch below the table). | False |
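Post-Script SQL statements are plain DDL/DML run against Snowflake as part of the write, separate from the DataFrame itself. A minimal, non-generated Python sketch of the same idea, using the connector's Utils.runQuery helper through the Spark JVM bridge; the exact call shown here is an assumption to verify against the Snowflake Spark connector documentation:

def run_post_script(spark: SparkSession, options: dict) -> None:
    # "options" is the same connection map used by the writer; the statement must
    # not return a result set (DDL/DML only).
    spark.sparkContext._jvm.net.snowflake.spark.snowflake.Utils.runQuery(
        options,
        "DELETE FROM CUSTOMERS WHERE C_CUSTOMER_SK IS NULL"  # illustrative statement
    )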
Supported Write Modes
Write Mode | Description |
---|---|
overwrite | If data already exists, overwrite with the contents of the DataFrame. |
append | If data already exists, append the contents of the DataFrame. |
ignore | If data already exists, do nothing with the contents of the DataFrame. This is similar to a CREATE TABLE IF NOT EXISTS in SQL. |
error | If data already exists, throw an exception. |
Example
Generated Code
- Python
- Scala
def customer_snow_tg(spark: SparkSession, in0: DataFrame):
    from pyspark.dbutils import DBUtils
    options = {
        "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
        "sfUser": "anshuman",
        "sfPassword": "******",
        "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
        "sfSchema": "TPCDS_SF100TCL",
        "sfWarehouse": "COMPUTE_WH"
    }
    writer = in0.write.format("snowflake").options(**options)
    writer = writer.option("dbtable", "CUSTOMERS")
    writer = writer.mode("overwrite")
    writer.save()
object customer_snow_tg {
  def apply(spark: SparkSession, in: DataFrame): Unit = {
    import net.snowflake.spark.snowflake.Utils
    import com.databricks.dbutils_v1.DBUtilsHolder.dbutils
    val options = Map(
      "sfUrl" → "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
      "sfUser" → "anshuman",
      "sfPassword" → "******",
      "sfDatabase" → "SNOWFLAKE_SAMPLE_DATA",
      "sfSchema" → "TPCDS_SF100TCL",
      "sfWarehouse" → "COMPUTE_WH"
    )
    var writer = in.write.format("snowflake").options(options)
    writer = writer.option("dbtable", "CUSTOMERS")
    writer = writer.mode("overwrite")
    writer.save()
  }
}