Databricks

In Prophecy, datasets stored in the SQL Warehouse Connection defined in your fabric are accessed using Table gems. Unlike other source and target gems, Table gems run directly within the data warehouse, eliminating extra orchestration steps and improving performance.

Available configurations for Table gems vary based on your SQL warehouse provider. This page explains how to use the Table gem for a Databricks SQL warehouse, including supported table types, configuration options, and guidance for managing Databricks tables in your Prophecy pipelines.

Table types

The following table types are supported for Databricks connections.

| Name | Description | Type |
|------|-------------|------|
| Table | Persistent storage of structured data in your SQL warehouse. Faster for frequent queries (indexed). | Source or Target |
| View | A virtual table that derives data dynamically from a query. Slower for complex queries (computed at runtime). | Source or Target |
| Seed | Small CSV-format files that you can write directly in Prophecy. | Source only |
info

For more information, visit the Databricks documentation on Tables and Views.
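The practical difference between the first two types can be shown with standard SQL: a table stores rows, while a view is a stored query that is re-run on every read. The sketch below uses Python's built-in sqlite3 purely to illustrate these semantics; Databricks tables and views behave the same way for this distinction, and the table and column names are made up for the example.

```python
import sqlite3

# In-memory database for illustration only; the table-vs-view semantics
# shown here match standard SQL, including Databricks SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 10.0), (2, 25.0)")

# A view is a stored query, recomputed at read time.
conn.execute("CREATE VIEW big_orders AS SELECT * FROM orders WHERE amount > 20")

print(conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0])  # 1

# New rows in the base table appear in the view automatically.
conn.execute("INSERT INTO orders VALUES (3, 99.0)")
print(conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0])  # 2
```

This recompute-on-read behavior is also why views can be slower for complex queries, as noted in the table above.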

Gem configuration

Tables

Tables are persistent, indexed storage objects optimized for frequent access.

Source parameters

| Parameter | Description |
|-----------|-------------|
| Location | Specify the table's location using database, schema, and name. |
| Properties | Define or infer the schema. Add a description if needed. |
| Preview | Load a sample of the data before saving. |

Target parameters

| Parameter | Description |
|-----------|-------------|
| Location | Choose the location where the table will be stored. You can create a new table by entering a new table name. |
| Properties | Define certain properties of the table. The schema cannot be changed for targets. |
| Write Options | Select how data is written on each pipeline run (Table only): overwrite or append. |
| Preview | Load the data to see a preview before saving. |
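The two write options map to standard warehouse write semantics: append adds new rows on each run, while overwrite replaces the table's contents. A minimal sketch of the difference, using sqlite3 as a stand-in for the warehouse (on Databricks SQL, overwrite corresponds to `INSERT OVERWRITE`; the `write` helper and table name here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER)")

def write(rows, mode):
    """Illustrative write helper; mode names match the gem's Write Options."""
    if mode == "overwrite":
        # Databricks SQL would use INSERT OVERWRITE; the plain-SQL
        # equivalent is to clear the table before inserting.
        conn.execute("DELETE FROM target")
    conn.executemany("INSERT INTO target VALUES (?)", [(r,) for r in rows])

write([1, 2], "append")
write([3], "append")      # table now holds 1, 2, 3
write([9], "overwrite")   # table now holds only 9
```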

Views

Views are virtual tables recomputed at runtime from a query.

Source parameters

| Parameter | Description |
|-----------|-------------|
| Location | Enter the database, schema, and table (view) name. |
| Properties | Define or infer the schema. Add a description if needed. |
| Preview | Load data based on the view's underlying query. |

Target parameters

| Parameter | Description |
|-----------|-------------|
| Location | Define the name of the view to be created or replaced. |
| Properties | Define certain properties of the view. The schema cannot be changed for targets. |
| Preview | Load a preview of the resulting view. |
note

Every time the pipeline runs, the target is overwritten. This is because the view is recomputed from scratch based on the underlying logic, and any previously materialized results are discarded. No additional write modes are supported.
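In SQL terms, writing a view target amounts to re-issuing the view's definition on every run; Databricks SQL supports this directly with `CREATE OR REPLACE VIEW`. A sketch of the equivalent behavior (sqlite3, used here only for illustration, lacks `OR REPLACE` for views, so the sketch drops and recreates; the names and the `publish_view` helper are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10.0), ("west", 30.0)])

def publish_view(threshold):
    """Re-create the view target from scratch, as happens on every run.

    On Databricks SQL this is a single CREATE OR REPLACE VIEW statement.
    """
    conn.execute("DROP VIEW IF EXISTS top_sales")
    conn.execute(
        "CREATE VIEW top_sales AS SELECT * FROM sales WHERE amount > ?1"
        .replace("?1", str(threshold)))

publish_view(5)   # first run: view sees both rows
publish_view(20)  # later run: definition fully replaced, view sees one row
```

Because the definition is replaced wholesale, there is nothing to append to, which is why no other write modes apply.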

Seeds

Seeds are lightweight, source-only CSV datasets defined directly in your project.

| Parameter | Description |
|-----------|-------------|
| Properties | Copy-paste your CSV data and define certain properties of the table. |
| Preview | Load a preview of your seed in table format. |
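A seed is simply a small block of CSV text pasted into the project. As an illustration of the format (the column names and values here are hypothetical), the pasted text parses into header-keyed rows like this:

```python
import csv
import io

# Hypothetical seed contents, exactly as you would paste them into the gem.
seed_csv = """country_code,country_name
US,United States
DE,Germany
JP,Japan
"""

# The first line is treated as the header; each remaining line is a row.
rows = list(csv.DictReader(io.StringIO(seed_csv)))
print(len(rows))                 # 3
print(rows[0]["country_name"])   # United States
```

Small lookup tables like this are the typical use case; larger datasets belong in regular tables.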

Properties

Tables in pipelines do not support dbt properties, which are only applicable to model sources and targets.

Cross-workspace access

If your fabric uses Databricks as the SQL warehouse, you can’t select Databricks in an external Source or Target gem. Instead, you must use Table gems, which are limited to the Databricks warehouse defined in the SQL warehouse connection.

To work with tables from a different Databricks workspace, use Delta Sharing. Delta Sharing lets you access data across workspaces without creating additional Databricks connections.
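On the consuming side, Delta Sharing identifies a table by a provider-issued profile file plus a share/schema/table path. A hedged sketch using the open-source `delta-sharing` Python client; the profile path and share, schema, and table names below are placeholders, not values from this document:

```python
# Requires the open-source client: pip install delta-sharing
# import delta_sharing

# Path to a share profile file downloaded from the data provider (placeholder).
profile = "config.share"

# Delta Sharing addresses a table as <profile>#<share>.<schema>.<table>.
table_url = f"{profile}#my_share.my_schema.my_table"

# Reading across workspaces would then be (commented out, needs a real share):
# df = delta_sharing.load_as_pandas(table_url)

print(table_url)  # config.share#my_share.my_schema.my_table
```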

info

Prophecy implements this guardrail to avoid using external connections when the data can be made available in your warehouse. External connections introduce an extra data transfer step, which slows down pipeline execution and adds unnecessary complexity. For best performance, Prophecy always prefers reading and writing directly within the warehouse.

Reusing and sharing tables

After you create a table in Prophecy, you can reuse its configuration across your entire project. All created tables appear in the Project tab in the left sidebar. To make tables available to other teams, you can share your project as a package in the Package Hub. Other users will be able to use the shared table configuration, provided they have the necessary permissions in Databricks to access the underlying data.