
DeltaTableOperations

Spark Gem

This gem performs the following operations on Delta tables:

  1. Register table in catalog
  2. Vacuum table
  3. Optimize table
  4. Restore table
  5. Delete from table
  6. Drop table
  7. FSCK Repair table

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| Database name | Database name | False |
| Table name | Table name | False |
| File path | File path for delta table | False |
| Action | Action to perform on the table | True |

Note: At least one of table name or file path must be provided.
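
The rule in the note can be sketched as a small helper that turns the three location parameters into a single SQL identifier. This is only an illustration: the function name `delta_target` is hypothetical, and letting the table name win when both a name and a path are given is an assumption, not documented gem behavior.

```python
from typing import Optional


def delta_target(database: Optional[str] = None,
                 table: Optional[str] = None,
                 path: Optional[str] = None) -> str:
    """Return the SQL identifier for a Delta table given by name or by path."""
    if table:
        # Qualify with the database only when one is given.
        # Assumption: the table name takes precedence over the file path.
        return f"{database}.{table}" if database else table
    if path:
        # Delta Lake SQL addresses a path-based table as delta.`<path>`.
        return f"delta.`{path}`"
    raise ValueError("Provide at least one of table name or file path")
```

For example, `delta_target(path="/data/orders")` yields ``delta.`/data/orders` ``, while calling it with no table name and no path raises a `ValueError`.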

Example

[Figure: Example usage of the Delta Table Operations Gem]

Register table in catalog

Registers the data at the given file path as a table in whichever metadata catalog is available in your execution environment.
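
In plain Delta Lake SQL, registering existing Delta files is usually done with a `CREATE TABLE ... USING DELTA LOCATION` statement. The sketch below only builds that statement as a string. The helper name `register_table_sql` is hypothetical, and the code the gem actually generates may differ.

```python
from typing import Optional


def register_table_sql(table: str, path: str,
                       database: Optional[str] = None) -> str:
    """Build a statement that registers Delta files at `path` under a table name."""
    name = f"{database}.{table}" if database else table
    # IF NOT EXISTS keeps the statement safe to re-run (an assumption of this sketch).
    return f"CREATE TABLE IF NOT EXISTS {name} USING DELTA LOCATION '{path}'"
```

The resulting string can be passed to `spark.sql(...)` on a session that has Delta Lake configured.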

Vacuum table

Recursively vacuum directories associated with the Delta table. VACUUM removes all files from the table directory that are not managed by Delta, as well as data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold. The default threshold is 7 days.

To learn more, see the VACUUM command documentation.

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| Retention hours | Retention threshold | False |
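
As a rough illustration of how the retention parameter maps to Delta SQL, the sketch below builds a `VACUUM` statement and leaves out the `RETAIN` clause when no value is given, so the 7-day default applies. The names `vacuum_sql` and `DEFAULT_RETENTION_HOURS` are hypothetical and not part of the gem.

```python
from typing import Optional

DEFAULT_RETENTION_HOURS = 7 * 24  # the default threshold of 7 days, in hours


def vacuum_sql(target: str, retention_hours: Optional[float] = None) -> str:
    """Build a VACUUM statement; omit RETAIN to use the default threshold."""
    if retention_hours is None:
        return f"VACUUM {target}"
    if retention_hours < 0:
        raise ValueError("retention_hours must not be negative")
    # :g prints whole numbers without a trailing ".0".
    return f"VACUUM {target} RETAIN {retention_hours:g} HOURS"
```

Delta Lake rejects a retention below the default unless the safety check `spark.databricks.delta.retentionDurationCheck.enabled` is turned off, so short thresholds should be used with care.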

Optimize table

Optimizes the layout of Delta Table data. Optionally optimize a subset of data or colocate data by column. If colocation is not specified, bin-packing optimization is performed by default.

To learn more, see the OPTIMIZE command documentation.

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| Where clause | Optimize the subset of rows matching the given partition predicate. Only filters involving partition key attributes are supported. | False |
| ZOrder By | List of columns to perform ZOrder on | False |
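
The two optional parameters correspond to the `WHERE` and `ZORDER BY` clauses of the Delta `OPTIMIZE` statement. A minimal sketch of how they combine is shown below. The helper name `optimize_sql` and the example column names are hypothetical.

```python
from typing import Optional, Sequence


def optimize_sql(target: str,
                 where: Optional[str] = None,
                 zorder_by: Sequence[str] = ()) -> str:
    """Build an OPTIMIZE statement with optional WHERE and ZORDER BY clauses."""
    parts = [f"OPTIMIZE {target}"]
    if where:
        # Must be a predicate on partition columns only.
        parts.append(f"WHERE {where}")
    if zorder_by:
        # Without this clause, plain bin-packing is performed.
        parts.append(f"ZORDER BY ({', '.join(zorder_by)})")
    return " ".join(parts)
```

For example, `optimize_sql("sales.orders", zorder_by=["customer_id"])` produces `OPTIMIZE sales.orders ZORDER BY (customer_id)`.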

Restore table

Restores a Delta table to an earlier state. Restoring to an earlier version number or a timestamp is supported.

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| Restore via | Restore the table via timestamp or version | False |
| Value | Value to restore on | False |
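
The "Restore via" choice selects between the two forms of the Delta `RESTORE` statement. The sketch below shows that mapping under the assumption that the option is passed as the string `"version"` or `"timestamp"`. Those literal values and the helper name `restore_sql` are illustrative, not taken from the gem.

```python
def restore_sql(target: str, restore_via: str, value: str) -> str:
    """Build a RESTORE statement for a version number or a timestamp."""
    mode = restore_via.strip().lower()
    if mode == "version":
        # int() rejects anything that is not a whole version number.
        return f"RESTORE TABLE {target} TO VERSION AS OF {int(value)}"
    if mode == "timestamp":
        return f"RESTORE TABLE {target} TO TIMESTAMP AS OF '{value}'"
    raise ValueError("restore_via must be 'version' or 'timestamp'")
```

For example, `restore_sql("sales.orders", "version", "3")` produces `RESTORE TABLE sales.orders TO VERSION AS OF 3`.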

Delete from table

Removes the rows that match the specified condition from the latest version of the Delta table. The deleted data stays in physical storage until the older versions are explicitly vacuumed.

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| Where clause | Condition which needs to be satisfied to delete a row | True |
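
Because the where clause is required here, a sketch of the equivalent Delta SQL would refuse to build a statement without one. The helper name `delete_sql` is hypothetical, and the gem may validate its input differently.

```python
def delete_sql(target: str, where: str) -> str:
    """Build a DELETE statement; an empty condition is rejected."""
    condition = (where or "").strip()
    if not condition:
        # Guard against deleting every row by accident.
        raise ValueError("a where clause is required")
    return f"DELETE FROM {target} WHERE {condition}"
```

Running the statement only writes a new table version. Reclaiming the storage still takes a later vacuum once the retention threshold has passed.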

Drop table

Drops the table from the catalog and removes its files.
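
The catalog half of this operation corresponds to a standard `DROP TABLE` statement, sketched below with a hypothetical helper `drop_table_sql`. Note that in Spark SQL, `DROP TABLE` on its own deletes data files only for managed tables. How the gem removes the files of a table registered from an external path is not covered by this sketch.

```python
from typing import Optional


def drop_table_sql(table: str, database: Optional[str] = None) -> str:
    """Build a DROP TABLE statement for a catalog table."""
    name = f"{database}.{table}" if database else table
    # IF EXISTS avoids an error when the table is already gone
    # (an assumption of this sketch, not confirmed gem behavior).
    return f"DROP TABLE IF EXISTS {name}"
```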

FSCK Repair table

Removes the file entries from the transaction log of a Delta table that can no longer be found in the underlying file system. This can happen when these files have been manually deleted.

To learn more, see the FSCK REPAIR TABLE command documentation.
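
On Databricks, this repair is exposed as the `FSCK REPAIR TABLE` statement, which also accepts a `DRY RUN` option that lists the missing file entries without changing the transaction log. The sketch below builds either form. The helper name `fsck_repair_sql` and its `dry_run` flag are illustrative and are not parameters of the gem.

```python
def fsck_repair_sql(target: str, dry_run: bool = False) -> str:
    """Build an FSCK REPAIR TABLE statement, optionally as a dry run."""
    sql = f"FSCK REPAIR TABLE {target}"
    # DRY RUN only reports the entries that would be removed.
    return f"{sql} DRY RUN" if dry_run else sql
```

Previewing with `dry_run=True` before running the real repair is a cautious way to confirm which files were deleted manually.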