# Redshift

You can use the Source and Target gems to read from and write to Amazon Redshift.

## Parameters
Parameter | Tab | Description |
---|---|---|
Username | Location | Username for your JDBC instance. |
Password | Location | Password for your JDBC instance. |
JDBC URL | Location | JDBC URL to connect to. You can specify source-specific connection properties in the URL. For example: `jdbc:postgresql://test.us-east-1.rds.amazonaws.com:5432/postgres` or `jdbc:mysql://database-mysql.test.us-east-1.rds.amazonaws.com:3306/mysql`. |
Temporary Directory | Location | S3 location to temporarily store data before it's loaded into Redshift. |
Data Source | Location | Strategy to read data. In the Source gem, you can select DB Table or SQL Query. In the Target gem, you must enter a table. To learn more, see DB Table and SQL Query. |
Schema | Properties | Schema to apply on the loaded data. In the Source gem, you can define or edit the schema visually or in JSON code. In the Target gem, you can view the schema visually or as JSON code. |
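As a hedged illustration, the following sketch shows how these parameters might map onto an equivalent Spark read using the open-source spark-redshift connector. The format name, cluster URL, credentials, bucket, and table are assumptions for the example, not values from this page.

```python
from pyspark.sql import SparkSession

# A minimal sketch, assuming the open-source spark-redshift connector is on
# the classpath. The URL, credentials, bucket, and table are hypothetical.
spark = SparkSession.builder.getOrCreate()

df = (
    spark.read.format("io.github.spark_redshift_community.spark.redshift")
    .option("url", "jdbc:redshift://example-cluster.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("user", "example_user")                  # Username
    .option("password", "example_password")          # Password
    .option("tempdir", "s3a://example-bucket/tmp/")  # Temporary Directory
    .option("dbtable", "public.sales")               # Data Source: DB Table
    .load()
)
df.printSchema()  # the loaded schema corresponds to the Schema property
```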
### DB Table

The `DB Table` option dictates which table to use as the source to read from. You can use anything that is valid in a `FROM` clause of a SQL query. For example, instead of a table name, you can use a subquery in parentheses.

The `DB Table` option and the `query` parameter are mutually exclusive, which means that you cannot specify both at the same time.
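For example, the sketch below (reusing the hypothetical session and connection details from the earlier sketch) passes a parenthesized subquery with an alias where a plain table name would otherwise go:

```python
# Sketch only: reuses the hypothetical `spark` session and connection details
# from the earlier example. Anything valid in a FROM clause can stand in for
# the table name; the alias keeps the generated SQL valid.
df = (
    spark.read.format("io.github.spark_redshift_community.spark.redshift")
    .option("url", "jdbc:redshift://example-cluster.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("user", "example_user")
    .option("password", "example_password")
    .option("tempdir", "s3a://example-bucket/tmp/")
    .option("dbtable", "(SELECT id, amount FROM public.sales WHERE amount > 100) AS filtered_sales")
    .load()
)
```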
### SQL Query

The `SQL Query` option specifies which query to use as a subquery in the `FROM` clause. Spark also assigns an alias to the subquery clause. For example, Spark issues the following query to the JDBC source:

```sql
SELECT columns FROM (<user_specified_query>) spark_gen_alias
```
The following restrictions exist when you use this option:
- You cannot use the `query` and `partitionColumn` options at the same time.
- If you must specify the `partitionColumn` option, you can specify the subquery using the `dbtable` option and qualify your partition columns using the subquery alias provided as part of `dbtable`.
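A hedged sketch of this alternative follows: the query is passed through the `query` option (connection details are the same hypothetical placeholders as before), and Spark wraps it in the aliased subquery shown above.

```python
# Sketch only: the query option replaces dbtable; the two are mutually
# exclusive, and query cannot be combined with partitionColumn.
df = (
    spark.read.format("io.github.spark_redshift_community.spark.redshift")
    .option("url", "jdbc:redshift://example-cluster.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("user", "example_user")
    .option("password", "example_password")
    .option("tempdir", "s3a://example-bucket/tmp/")
    .option("query", "SELECT region, SUM(amount) AS total FROM public.sales GROUP BY region")
    .load()
)
```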
## Source

The Source gem reads data from Redshift. You can optionally specify the following additional properties.

### Source properties
Property | Description | Default |
---|---|---|
Forward S3 access credentials to Databricks | Whether to forward S3 access credentials to Databricks. | false |
Driver | Class name of the Redshift driver to connect to this URL. | None |
AWS IAM Role | Identity that grants permissions to access other AWS services. | None |
Temporary AWS access key ID | Whether to allow temporary credentials for authenticating to Redshift. | false |
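As a hedged sketch, these properties roughly correspond to options of the open-source spark-redshift connector. The option names (`jdbcdriver`, `forward_spark_s3_credentials`, `aws_iam_role`) are assumptions based on that connector, and the driver class and role ARN are placeholders.

```python
# Hedged sketch: one plausible mapping of the Source properties onto
# spark-redshift connector options. The driver class and role ARN are
# placeholders, not values from this page.
df = (
    spark.read.format("io.github.spark_redshift_community.spark.redshift")
    .option("url", "jdbc:redshift://example-cluster.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("user", "example_user")
    .option("password", "example_password")
    .option("tempdir", "s3a://example-bucket/tmp/")
    .option("dbtable", "public.sales")
    .option("jdbcdriver", "com.amazon.redshift.jdbc42.Driver")  # Driver
    .option("forward_spark_s3_credentials", "true")  # Forward S3 access credentials
    # Alternatively, authenticate the S3 <-> Redshift copy with an IAM role;
    # in the open-source connector these auth mechanisms are mutually exclusive:
    # .option("aws_iam_role", "arn:aws:iam::123456789012:role/example-redshift-role")
    .load()
)
```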
## Target

The Target gem writes data to Redshift. You can optionally specify the following additional properties.

### Target properties
Property | Description | Default |
---|---|---|
Forward S3 access credentials to Databricks | Whether to forward S3 access credentials to Databricks. | false |
Driver | Class name of the Redshift driver to connect to this URL. | None |
AWS IAM Role | Identity that grants permissions to access other AWS services. | None |
Temporary AWS access key ID | Whether to allow temporary credentials for authenticating to Redshift. | false |
Max length for string columns in Redshift | Maximum length for string columns in Redshift. | 2048 |
Row distribution style for new table | How to distribute data in a new table. For a list of the possible values, see Supported distribution styles. | None |
Distribution key for new table | If you selected Key as the Row distribution style for new table property, specify the key to distribute by. | None |
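A hedged sketch of the equivalent Spark write follows, assuming the spark-redshift connector's `diststyle` and `distkey` options; the table name, key column, and role ARN are hypothetical.

```python
# Hedged sketch: writing to Redshift with distribution properties, assuming
# the spark-redshift connector's diststyle/distkey write options.
(
    df.write.format("io.github.spark_redshift_community.spark.redshift")
    .option("url", "jdbc:redshift://example-cluster.us-east-1.redshift.amazonaws.com:5439/dev")
    .option("user", "example_user")
    .option("password", "example_password")
    .option("tempdir", "s3a://example-bucket/tmp/")
    .option("dbtable", "public.sales_by_region")
    .option("aws_iam_role", "arn:aws:iam::123456789012:role/example-redshift-role")
    .option("diststyle", "KEY")   # Row distribution style for new table
    .option("distkey", "region")  # Distribution key for new table
    .mode("overwrite")
    .save()
)
```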
### Supported distribution styles
Distribution style | Description |
---|---|
EVEN | Distribute the rows across the slices in a round-robin fashion. |
KEY | Distribute according to the values in one column. |
ALL | Distribute a copy of the entire table to every node. |
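For context, a hedged sketch of the Redshift DDL that KEY distribution implies, held in a Python string to match the other examples; the table and column names are hypothetical, and the `VARCHAR` length mirrors the default maximum length for string columns.

```python
# Hedged illustration: the kind of CREATE TABLE statement that KEY
# distribution implies on the Redshift side. Names are hypothetical.
create_table_ddl = """
CREATE TABLE public.sales_by_region (
    region VARCHAR(2048),   -- length follows the max-length property default
    total  DOUBLE PRECISION
)
DISTSTYLE KEY
DISTKEY (region)
"""
```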