Publish-Subscribe Data Projects
One of the best practices in data project development is to use standardized components. Prophecy enables this standardization in a few ways:
- Datasets
- User-Defined Functions
- Configured Subgraphs
- Configured Pipelines
Pipelines, Datasets, Subgraphs, and User-Defined Functions can now be shared across multiple projects and teams. This allows central Data Platform teams to build reusable code to cover a wide variety of business needs, such as Encryption/Decryption or Identity Masking, and have their "consumers" (the Data Practitioners) take a dependency on that reusable code. Since it's all versioned, when the reusable code changes, the downstream consumers will be notified and can update accordingly.
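To illustrate the kind of reusable logic a platform team might publish, here is a minimal sketch of an identity-masking helper in plain Python. The function name `mask_identifier` and its parameters are hypothetical and not part of Prophecy; in a real project, logic like this would typically be wrapped as a User-Defined Function and shared through a released Project.

```python
def mask_identifier(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Mask all but the last `visible` characters of an identifier.

    Hypothetical helper for illustration only; not a Prophecy API.
    """
    if visible < 0:
        raise ValueError("visible must be non-negative")
    if len(value) <= visible:
        # Too short to reveal anything safely: mask the whole value.
        return mask_char * len(value)
    hidden = len(value) - visible
    # Keep only the trailing `visible` characters readable.
    return mask_char * hidden + value[hidden:]
```

Keeping such a helper in one shared Project means every consuming team masks identifiers the same way, and a fix made once reaches all consumers when they update the dependency.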
Data admins can also create deployment templates for Pipelines that have best practices baked in for authorization, notifications, error handling, and logging the correct information.
Prophecy allows the Data Platform teams to create and publish these standards to the various teams.
To share these components across projects, the owning team needs to release the Project in which the Pipelines, UDFs, and other components are defined.
Please refer here for how to release a Project.
Project Dependency
Once a Project is released, it can be added as a dependency in other Projects.
Adding a dependency while creating Project
Adding a dependency to an existing Project
Using shared Project components
Once dependencies are added to a Project, users can see all Pipelines, Datasets, and Subgraphs available from the dependency Project in the Project Browser. Please see the below video for an example.
Updating the Dependency Project
If existing Subgraphs, UDFs, or other components are changed, or new ones are added, the dependency Project must be released again before the changes appear in dependent Projects. After a new version is released, users of dependent Projects see an option to update the dependency in their Project Browser. Please see below for an example.
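Conceptually, the update option appears when the latest released version of a dependency is newer than the version a Project currently uses. The following is a minimal sketch of that comparison, assuming versions are dotted numeric strings such as `0.1.2`. The function names are hypothetical, and this is not a description of how Prophecy implements the check internally.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dotted numeric version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def update_available(pinned: str, latest: str) -> bool:
    """Return True when the latest release is newer than the pinned one."""
    # Tuples compare element by element, so 0.1.10 sorts after 0.1.9.
    return parse_version(latest) > parse_version(pinned)
```

Comparing parsed integers rather than raw strings matters here: as strings, `"0.1.10"` would sort before `"0.1.9"`.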
Please see the below links for details on sharing UDFs, Subgraphs, Datasets, and Pipelines.
📄️ Shareable UDFs
Shareable UDFs within the project and to other projects
📄️ Shareable Subgraphs
Shareable Subgraphs within the project and to other projects
📄️ Shareable Pipelines
Shareable Pipelines within the project and to other projects
📄️ Shareable Datasets
Shareable Datasets within the project and to other projects