Delta Table

Reads from and writes to Delta tables that are managed by the execution environment's metadata catalog (Metastore).

note

Set the Provider property to Delta on the Properties page.

Source

Source Parameters

| Parameter | Description | Required |
|-----------|-------------|----------|
| Database name | Name of the database | True |
| Table name | Name of the table | True |
| Provider | Must be set to Delta | True |
| Filter Predicate | Where clause to filter the table | False |
| Read Timestamp | Time travel to a specific timestamp | False |
| Read Version | Time travel to a specific version of the table | False |
note

For time travel on Delta tables (see the sketch after this list):

  1. Only Read Timestamp or Read Version can be selected, not both.
  2. The timestamp must fall between the first commit timestamp and the latest commit timestamp of the table.
  3. The version must be an integer between the minimum and maximum versions of the table.

If no time travel option is used, the most recent version of each row is fetched by default.
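As a rough illustration, the two time travel options correspond to the standard Delta reader options `timestampAsOf` and `versionAsOf`. A minimal sketch, reusing the example table `test_db.test_table` from this page; the timestamp and version values are placeholders:

```python
from pyspark.sql import SparkSession, DataFrame

def source_timestamp(spark: SparkSession) -> DataFrame:
    # Read Timestamp: return the table as of a past commit time
    # (the timestamp value here is a placeholder).
    return (spark.read
            .option("timestampAsOf", "2024-01-01 00:00:00")
            .table("test_db.test_table"))

def source_version(spark: SparkSession) -> DataFrame:
    # Read Version: return a specific version of the table instead.
    return (spark.read
            .option("versionAsOf", 0)
            .table("test_db.test_table"))
```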

info

To read more about Delta time travel and its use cases, click here.

Source Example

Generated Code

Without filter predicate

```python
def Source(spark: SparkSession) -> DataFrame:
    return spark.read.table("test_db.test_table")
```

With filter predicate

```python
def Source(spark: SparkSession) -> DataFrame:
    return spark.sql("SELECT * FROM test_db.test_table WHERE col > 10")
```

Target

Target Parameters

| Parameter | Description | Required |
|-----------|-------------|----------|
| Database name | Name of the database | True |
| Table name | Name of the table | True |
| Custom file path | Use a custom file path to store the underlying files. | False |
| Provider | Must be set to Delta | True |
| Write Mode | How to handle existing data. See this table for a list of available options. (Default is error.) | True |
| Use insert into | Flag to use the insertInto method to write instead of save | False |
| Optimize write | If true, optimizes Spark partition sizes based on the actual data. | False |
| Overwrite table schema | If true, overwrites the schema of the Delta table. | False |
| Merge schema | If true, any columns present in the DataFrame but not in the target table are automatically added to the end of the schema as part of the write transaction. | False |
| Partition Columns | List of columns to partition the Delta table by | False |
| Overwrite partition predicate | If specified, selectively overwrites only the data that satisfies the given where clause expression. | False |
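As a rough illustration of how several of these parameters map onto standard Delta writer options; this is a sketch, not Prophecy-generated code, and the column `event_date` and its predicate are hypothetical:

```python
from pyspark.sql import SparkSession, DataFrame

def target_sketch(spark: SparkSession, in0: DataFrame):
    (in0.write
        .format("delta")
        .mode("overwrite")
        # Partition Columns: partition the table by these columns.
        .partitionBy("event_date")
        # Overwrite partition predicate: only rows matching this
        # where clause are replaced by the overwrite.
        .option("replaceWhere", "event_date >= '2024-01-01'")
        .saveAsTable("test_db.test_table"))
```

Here `replaceWhere` is the standard Delta writer option behind the selective-overwrite behavior described for Overwrite partition predicate above.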
note

Among these write modes, overwrite, append, ignore, and error work the same way as they do with other native Spark-supported formats such as Parquet.

To read more about using the merge write mode, click here.

To read more about using the SCD2 merge write mode, click here.
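For orientation, a Delta merge is typically expressed with the DeltaTable API from the delta-spark package. A minimal sketch, assuming a hypothetical key column `id`; the generated merge code may differ:

```python
from delta.tables import DeltaTable
from pyspark.sql import SparkSession, DataFrame

def merge_sketch(spark: SparkSession, in0: DataFrame):
    target = DeltaTable.forName(spark, "test_db.test_table")
    (target.alias("t")
        .merge(in0.alias("s"), "t.id = s.id")  # hypothetical join key
        .whenMatchedUpdateAll()     # update rows whose key already exists
        .whenNotMatchedInsertAll()  # insert rows with new keys
        .execute())
```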

Target Example

Generated Code

```python
def Target(spark: SparkSession, in0: DataFrame):
    in0.write \
        .format("delta") \
        .mode("overwrite") \
        .saveAsTable("test_db.test_table")
```
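
If the Use insert into flag is enabled, the write goes through insertInto rather than saveAsTable. A sketch of the equivalent call; note that insertInto resolves columns by position, so the DataFrame's column order must match the table's:

```python
def Target(spark: SparkSession, in0: DataFrame):
    # insertInto writes into the table's existing schema and partitioning;
    # mode("overwrite") replaces existing data, as with saveAsTable.
    in0.write \
        .mode("overwrite") \
        .insertInto("test_db.test_table")
```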