Snowflake

Allows read and write operations on Snowflake.

How to configure Key Pair Authentication on Snowflake?

Please refer to the official Snowflake documentation on attaching a private key to a user account: Configuring Key Pair Authentication.
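Once the key is attached to the user, the connector authenticates with the private key instead of a password. Below is a minimal sketch of how the key material can be loaded and passed to the Spark Snowflake connector through its `pem_private_key` option; the file path is hypothetical, and the key is assumed to be an unencrypted PKCS8 key (an encrypted key would need to be decrypted first).

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical path to an unencrypted PKCS8 private key in PEM format
PRIVATE_KEY_PATH = "/dbfs/keys/snowflake_rsa_key.p8"

with open(PRIVATE_KEY_PATH) as f:
    # Keep only the base64 key body: drop the PEM header/footer lines
    pem_private_key = "".join(
        line.strip() for line in f if "PRIVATE KEY" not in line
    )

df = (
    spark.read.format("snowflake")
    .options(**{
        "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
        "sfUser": "anshuman",
        "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
        "sfSchema": "TPCDS_SF100TCL",
        "sfWarehouse": "COMPUTE_WH",
        # pem_private_key replaces sfPassword when using key pair auth
        "pem_private_key": pem_private_key,
    })
    .option("dbtable", "CUSTOMER")
    .load()
)
```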

Source

Source Parameters

| Parameter | Description | Required |
|---|---|---|
| Dataset Name | Name of the Dataset | True |
| Credential Type | Credential Type: Databricks Secrets or Username & Password | True |
| Credentials | Databricks credential name, else username and password for the Snowflake account | Required if Credential Type is Databricks Secrets |
| Username | Login name for the Snowflake user | Required if Credential Type is Username & Password |
| Password | Password for the Snowflake user | Required if Credential Type is Username & Password |
| Private key filepath | Location of the private key file in PKCS8 format | Required if Private Key Details is enabled |
| Private key passphrase | Passphrase of the private key file | Required if the private key file is passphrase-protected |
| Url | Hostname for your account in the format `<account_identifier>.snowflakecomputing.com`, e.g. `https://DJ07623.ap-south-1.aws.snowflakecomputing.com` | True |
| Database | Database to use for the session after connecting | True |
| Schema | Schema to use for the session after connecting | True |
| Warehouse | Default virtual warehouse to use for the session after connecting | False |
| Role | Default security role to use for the session after connecting | False |
| Data Source | Strategy to read data: DB Table or SQL Query | True |
| Table | Name of the table to be read. All columns and records are retrieved (equivalent to `SELECT * FROM table`). | Required if Data Source is DB Table |
| SQL Query | Exact query (`SELECT` statement) to run | Required if Data Source is SQL Query |

Example

Generated Code

```python
from pyspark.sql import DataFrame, SparkSession


def sf_customer(spark: SparkSession) -> DataFrame:
    from pyspark.dbutils import DBUtils  # used when credentials come from Databricks Secrets

    # Connection options for the Snowflake Spark connector
    return spark.read\
        .format("snowflake")\
        .options(
            **{
                "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
                "sfUser": "anshuman",
                "sfPassword": "*******",
                "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
                "sfSchema": "TPCDS_SF100TCL",
                "sfWarehouse": "COMPUTE_WH"
            }
        )\
        .option("dbtable", "CUSTOMER")\
        .load()
```
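When Data Source is set to SQL Query instead of DB Table, the connector takes a `query` option in place of `dbtable`. A minimal sketch of that variant (the `SELECT` statement itself is illustrative):

```python
df = (
    spark.read.format("snowflake")
    .options(**{
        "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
        "sfUser": "anshuman",
        "sfPassword": "*******",
        "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
        "sfSchema": "TPCDS_SF100TCL",
        "sfWarehouse": "COMPUTE_WH",
    })
    # "query" replaces "dbtable" when reading with a SELECT statement
    .option("query", "SELECT C_CUSTOMER_ID, C_LAST_NAME FROM CUSTOMER LIMIT 100")
    .load()
)
```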

Target

Target Parameters

| Parameter | Description | Required |
|---|---|---|
| Dataset Name | Name of the Dataset | True |
| Credential Type | Credential Type: Databricks Secrets or Username & Password | True |
| Credentials | Databricks credential name, else username and password for the Snowflake account | Required if Credential Type is Databricks Secrets |
| Username | Login name for the Snowflake user | Required if Credential Type is Username & Password |
| Password | Password for the Snowflake user | Required if Credential Type is Username & Password |
| Private key filepath | Location of the private key file in PKCS8 format | Required if Private Key Details is enabled |
| Private key passphrase | Passphrase of the private key file | Required if the private key file is passphrase-protected |
| Url | Hostname for your account in the format `<account_identifier>.snowflakecomputing.com`, e.g. `https://DJ07623.ap-south-1.aws.snowflakecomputing.com` | True |
| Database | Database to use for the session after connecting | True |
| Schema | Schema to use for the session after connecting | True |
| Warehouse | Default virtual warehouse to use for the session after connecting | False |
| Role | Default security role to use for the session after connecting | False |
| Table | Name of the table to which data is to be written | True |
| Write Mode | How to handle existing data. See the Supported Write Modes table below for the available options. | True |
| Post-Script SQL | DDL/DML SQL statements to execute before writing data. Intended for statements that do not return a result set, e.g. DDL statements like `CREATE TABLE` and DML statements like `INSERT`, `UPDATE`, and `DELETE`. Not useful for statements that return a result set, such as `SELECT` or `SHOW`. | False |
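Post-Script SQL statements of this kind can be executed through the Snowflake Spark connector's `Utils.runQuery` helper, which is reachable from Python via the JVM gateway. A sketch, assuming the same connection `options` dictionary as in the generated code example further below (the `DELETE` statement is illustrative):

```python
# Run a DDL/DML statement that returns no result set.
# `options` is the same Snowflake connection dictionary used by the writer.
spark._jvm.net.snowflake.spark.snowflake.Utils.runQuery(
    options,
    "DELETE FROM CUSTOMERS WHERE C_CUSTOMER_SK IS NULL"
)
```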

Supported Write Modes

| Write Mode | Description |
|---|---|
| overwrite | If data already exists, overwrite it with the contents of the DataFrame. |
| append | If data already exists, append the contents of the DataFrame. |
| ignore | If data already exists, do nothing with the contents of the DataFrame. This is similar to a `CREATE TABLE IF NOT EXISTS` in SQL. |
| error | If data already exists, throw an exception. |

Example

Generated Code

```python
from pyspark.sql import DataFrame, SparkSession


def customer_snow_tg(spark: SparkSession, in0: DataFrame):
    from pyspark.dbutils import DBUtils  # used when credentials come from Databricks Secrets

    options = {
        "sfUrl": "https://DJ07623.ap-south-1.aws.snowflakecomputing.com",
        "sfUser": "anshuman",
        "sfPassword": "******",
        "sfDatabase": "SNOWFLAKE_SAMPLE_DATA",
        "sfSchema": "TPCDS_SF100TCL",
        "sfWarehouse": "COMPUTE_WH"
    }
    # Write the incoming DataFrame to the CUSTOMERS table, replacing any existing data
    writer = in0.write.format("snowflake").options(**options)
    writer = writer.option("dbtable", "CUSTOMERS")
    writer = writer.mode("overwrite")
    writer.save()
```
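When Credential Type is Databricks Secrets, the hard-coded `sfUser` and `sfPassword` values above are instead resolved at runtime through DBUtils (which is why the generated code imports it). A sketch, with a hypothetical secret scope and key names:

```python
from pyspark.dbutils import DBUtils

dbutils = DBUtils(spark)

# "snowflake", "username", and "password" are hypothetical scope/key names
options["sfUser"] = dbutils.secrets.get(scope="snowflake", key="username")
options["sfPassword"] = dbutils.secrets.get(scope="snowflake", key="password")
```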