
Publish-Subscribe Data Projects

One of the best practices in Data Project development is to use standardized components. Prophecy enables this standardization in a few ways:

  1. Datasets
  2. User-Defined Functions
  3. Configured Subgraphs
  4. Configured Pipelines

Pipelines, Datasets, Subgraphs, and User-Defined Functions can be shared across multiple Projects and teams. This allows central Data Platform teams to build reusable code that covers a wide variety of business needs, such as Encryption/Decryption or Identity Masking, and have their "consumers" (the Data Practitioners) take a dependency on that reusable code. Since it's all versioned, downstream consumers are notified when the reusable code changes and can update accordingly. A sketch of one such reusable function follows.
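As an illustration, here is a minimal sketch of the kind of reusable Identity Masking function a Data Platform team might publish. This is not Prophecy-generated code: `mask_identity` and the column name are hypothetical, and the sketch assumes a standard PySpark environment.

```python
from pyspark.sql import DataFrame
from pyspark.sql import functions as F

def mask_identity(df: DataFrame, column: str) -> DataFrame:
    """Replace a PII column with a one-way SHA-256 hash of its value.

    Hypothetical helper for illustration; a real shared component would
    be defined and versioned inside the platform team's Project.
    """
    return df.withColumn(column, F.sha2(F.col(column).cast("string"), 256))
```

A downstream consumer could then call `mask_identity(df, "customer_id")` in their own Pipeline instead of reimplementing the hashing logic, and pick up fixes simply by updating the dependency version.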

Data admins can also create deployment templates for Pipelines that have best practices baked into them for authorization, notifications, error handling, and logging the correct information. A simplified sketch of this pattern follows.
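To illustrate the idea, here is a minimal sketch of such a template as a plain Python wrapper. `run_with_standards` and `notify_on_failure` are hypothetical names, not Prophecy APIs; in practice this standardization would be expressed through configured Pipelines and Subgraphs.

```python
import logging

logger = logging.getLogger("pipeline_template")

def run_with_standards(pipeline_fn, notify_on_failure):
    """Run a pipeline function with standardized logging, error
    handling, and a failure-notification hook baked in."""
    logger.info("Starting pipeline run")
    try:
        result = pipeline_fn()
        logger.info("Pipeline run completed successfully")
        return result
    except Exception:
        # Log the full traceback, alert the owning team, then re-raise
        # so the scheduler still sees the failure.
        logger.exception("Pipeline run failed; sending notification")
        notify_on_failure()
        raise
```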

Prophecy allows Data Platform teams to create these standards and publish them to the various teams.

To share these components across Projects, release the Project in which the Pipelines, UDFs, and other components are defined.

Please refer here for how to release a Project.

Project Dependency

Once a Project is released, it can be added as a dependency in other Projects.

Adding a dependency while creating a Project

Adding a dependency to an existing Project

Using shared Project components

Once dependencies are added to a Project, all Pipelines, Datasets, and Subgraphs available from those Projects appear in the Project Browser. Please see the video below for an example.

Updating the Dependency Project

If changes are made to existing Subgraphs, UDFs, or other shared components, or new ones are added, the Project must be released again for those changes to reach dependent Projects. Once a new version is released, consumers will see an option to update the dependency in their Project Browser. Please see below for an example.

Please see the links below for details on sharing UDFs, Subgraphs, Datasets, and Pipelines.