Databricks

This page describes how to use the Databricks external Source and Target gems to read from and write to Databricks tables. Use an external Source or Target gem only when Databricks is not the configured SQL warehouse connection in your fabric. Otherwise, use the Table gem.

info

If you’re working with file types like CSV or Parquet from Databricks, see File types for guidance. This page focuses only on catalog tables.

Source configuration

Use these settings to configure a Databricks Source gem for reading data.

Source location

| Parameter | Description |
| --- | --- |
| Format type | Table format for the source. For Databricks tables, set this to `databricks`. |
| Select or create connection | Select an existing Databricks connection, or create a new one, in the Prophecy fabric you will use. |
| Database | Database that contains the schema where the table is located. |
| Schema | Schema that contains the table you want to read from. |
| Name | Exact name of the Databricks table to read data from. |
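The Source gem handles the connection and read for you. If you want to sanity-check the same read outside Prophecy, a minimal sketch using the open-source `databricks-sql-connector` package might look like the following. The hostname, HTTP path, token, and table identifier are placeholders, not values from this page:

```python
from databricks import sql

# Minimal read sketch with the open-source databricks-sql-connector package
# (pip install databricks-sql-connector). All connection values below are
# placeholders; a Prophecy connection stores the real ones for you.
with sql.connect(
    server_hostname="dbc-example.cloud.databricks.com",  # placeholder workspace host
    http_path="/sql/1.0/warehouses/abc123",              # placeholder warehouse path
    access_token="<personal-access-token>",              # placeholder credential
) as connection:
    with connection.cursor() as cursor:
        # Database, Schema, and Name from the table above combine into
        # the three-part identifier database.schema.table.
        cursor.execute("SELECT * FROM my_database.my_schema.my_table LIMIT 10")
        for row in cursor.fetchall():
            print(row)
```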

Target configuration

Use these settings to configure a Databricks Target gem for writing data.

Target location

| Parameter | Description |
| --- | --- |
| Format type | Table format for the target. For Databricks tables, set this to `databricks`. |
| Select or create connection | Select an existing Databricks connection, or create a new one, in the Prophecy fabric you will use. |
| Database | Database that contains the schema where the table is or will be located. |
| Schema | Schema where the target table will be created or updated. |
| Name | Name of the Databricks table to write data to. If the table doesn't exist, it is created automatically. |

Target properties

| Property | Description | Default |
| --- | --- | --- |
| Description | Description of the table. | None |
| Write Mode | Whether to overwrite the table completely, append new data to it, or throw an error if the table already exists. | None |
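The three write modes correspond to standard table-write semantics on Databricks. A minimal PySpark sketch of the behavior, with a placeholder table name (Prophecy issues the equivalent operations for you):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

target = "my_database.my_schema.my_target"  # placeholder three-part name

# Pick one mode per write; running all three in order would fail on the
# last line, because "error" mode raises once the table exists.
df.write.mode("overwrite").saveAsTable(target)  # replace the table contents entirely
df.write.mode("append").saveAsTable(target)     # keep existing rows and add new ones
df.write.mode("error").saveAsTable(target)      # throw an error if the table already exists
```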

Cross-workspace access

If your fabric uses Databricks as the SQL warehouse, you can’t select Databricks in an external Source or Target gem. Instead, you must use Table gems, which are limited to the Databricks warehouse defined in the SQL warehouse connection.

To work with tables from a different Databricks workspace, use Delta Sharing. Delta Sharing lets you access data across workspaces without creating additional Databricks connections.
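With Databricks-to-Databricks sharing, the shared data typically appears as a catalog in the recipient workspace, queryable like any other table. Outside Databricks, the open-source `delta-sharing` Python client can read a shared table directly. A minimal sketch, assuming a profile file issued by the providing workspace (the path and share coordinates are placeholders):

```python
import delta_sharing

# Read a shared table with the open-source delta-sharing client
# (pip install delta-sharing). The providing workspace issues the
# profile file; the path and share#schema.table below are placeholders.
profile = "/path/to/config.share"
table_url = profile + "#my_share.my_schema.my_table"

df = delta_sharing.load_as_pandas(table_url)
print(df.head())
```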

info

Prophecy implements this guardrail to avoid using external connections when the data can be made available in your warehouse. External connections introduce an extra data transfer step, which slows down pipeline execution and adds unnecessary complexity. For best performance, Prophecy always prefers reading and writing directly within the warehouse.