Fabrics
Prophecy helps you develop data pipelines in high-quality Spark or SQL code, but that code has to run somewhere. The first thing to understand before building any pipeline is that your pipeline must be connected to an execution environment.
This is why fabrics exist in Prophecy. Fabrics let Prophecy connect to specific execution environments and data storage.
Use case
Here is one way you might set up your fabrics. First, the team admin creates:
- A team named Marketing_DSS for the Marketing Decision Support System users.
- A dev fabric for development activities that specifies the Marketing_DSS team.
- A prod fabric for production pipelines that specifies the Marketing_DSS team.

In this example, all users in the Marketing_DSS team will have access to the dev and prod fabrics.
Components
Fabrics include everything required to run a data pipeline. As an example, the following table describes the components of a Databricks Spark fabric.
| Component | Description |
|---|---|
| Connection Credentials | Includes details like Workspace URL and Access Token for Databricks. |
| Cluster Configuration | Defines settings such as Databricks Runtime Version, Machine Type, and Idle Timeout. |
| Job Sizes | Lets you define reusable cluster sizes (e.g., an XL cluster with 10 i3.xlarge servers, 40 CPUs, and 70GB memory). |
| Scheduler | Executes Spark data pipelines on a defined schedule, such as weekdays at 9:00 AM. Databricks provides a default scheduler, and an Airflow Scheduler is available for enterprise users. |
| Database Connections | Supports connections to databases (MySQL, Postgres) and data warehouses (Snowflake) via JDBC or other protocols. Credentials are securely stored on the fabric for reuse. |
| Metadata Connection | Enhances fabric management for large datasets, useful for users handling hundreds or thousands of tables. Learn more. |
| Credentials & Secrets | Securely stores credentials in Databricks using Personal Access Tokens (PAT) or Databricks OAuth. Secrets are stored as key-value pairs, accessible only to running workflows. |
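For example, a pipeline running on a Databricks Spark fabric can use the fabric's database connection and stored secrets at runtime. The following is a minimal sketch, assuming a Databricks cluster and a hypothetical Postgres connection; the secret scope, keys, and JDBC URL are illustrative examples only, not values Prophecy defines for you.

```python
# Minimal sketch: using a JDBC database connection and secrets from a
# Databricks Spark execution environment. All names below are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# dbutils is provided by the Databricks runtime; secrets resolve only
# inside a running workflow.
jdbc_user = dbutils.secrets.get(scope="marketing_dss", key="postgres_user")
jdbc_password = dbutils.secrets.get(scope="marketing_dss", key="postgres_password")

campaigns_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db.example.com:5432/marketing")
    .option("dbtable", "public.campaigns")
    .option("user", jdbc_user)
    .option("password", jdbc_password)
    .load()
)

campaigns_df.show(5)
```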
Fabric creation
A team admin typically sets up fabrics. Detailed steps for fabric creation can be found in the Set up Spark fabrics and Set up SQL fabrics sections of the documentation.
Even though teams share fabrics, each user must add their own credentials to use a fabric in their projects.
Prophecy provides a trial Prophecy-managed fabric that can get you started with building your pipelines. However, you will need to connect to external execution environments for your production workflows.
Fabric usage
When you create a fabric, you define the team that owns the fabric. If you are a member of that team, you will be able to use the fabric. To attach a fabric to a project:
- Open a project from the Prophecy metadata page.
- Open a pipeline or model that you want to work on.
- Expand the Attach Cluster menu. This menu will differ slightly between Spark and SQL projects.
- Select a fabric. You will be shown fabrics that have the same data provider as your project (e.g., Databricks).
- Attach to a cluster or create a new cluster.
- Run your pipeline or model. This executes the data transformation on the environment defined in the fabric!
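When you run the pipeline, Prophecy executes its generated Spark code on the cluster defined by the attached fabric. The snippet below is a simplified, hypothetical sketch of that kind of code; the table names are examples only, not part of any real project.

```python
# Simplified sketch of the kind of Spark transformation a pipeline run
# executes on the fabric's cluster. Table names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

orders = spark.read.table("marketing_dss.raw_orders")

daily_revenue = (
    orders
    .groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"))
)

daily_revenue.write.mode("overwrite").saveAsTable("marketing_dss.daily_revenue")
```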
Fabric metadata
A list of all fabrics available to you can be found in the Fabrics tab of the Metadata page.
You can click into each fabric to access the fabric settings. These will resemble the settings that appear during fabric creation.
Hands-on
Get started with these hands-on guides, which show you step by step how to connect to your execution engine by creating a fabric:
- Create a SQL fabric with a JDBC or Unity Catalog connection by following this guide.
- Create a Databricks fabric by following these steps.
- Create an EMR fabric with Livy by following this step-by-step guide.