Projects
A project in Prophecy is the core unit for developing, organizing, and deploying data pipelines to production. It includes all the components required to build and execute data processes. Continue reading to learn about:
- Project types for various use cases
- Key components that support pipeline development within a project
- Versioning and publishing of projects
- Sharing projects across teams
- Navigating and editing projects
Project types
Project components depend on your project type. Prophecy supports Python, Scala, and SQL projects, and the language you choose determines how your visual pipelines are compiled into code. The project type also determines your project's capabilities and workflow structure. Choose carefully at the start, because the project type cannot be changed later.
The main choice is between SQL and Spark (Python or Scala) projects, and it depends on your data needs. Many organizations use both types, leveraging SQL for data analytics and Spark for data engineering. Prophecy provides a platform where both can live in one place.
To make project creation easier, Prophecy provides project templates for different needs. We suggest using the Prophecy for Analysts template when creating SQL projects for data analytics.
SQL is ideal when working with structured data in warehouses like Snowflake or Databricks SQL, offering simplicity, speed, and efficiency for moderate data volumes and interactive queries. It’s best for teams who need straightforward transformations without managing distributed infrastructure.
Spark excels at executing complex pipelines and processing semi-structured data. It is the better fit when performance and scalability are key, and it requires more data engineering knowledge. Data engineers can choose either Python or Scala as the backend code for Spark projects, depending on what they are comfortable with.
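The rules of thumb above can be sketched as a small decision helper. This is only an illustration of the reasoning, not a Prophecy feature: the `Workload` class, its field names, and `suggest_project_type` are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a workload; not a Prophecy object."""
    data_in_sql_warehouse: bool   # e.g. Snowflake or Databricks SQL
    semi_structured_data: bool    # data that is not cleanly tabular
    complex_pipelines: bool       # pipelines beyond straightforward transformations
    needs_large_scale: bool       # performance and scalability are key


def suggest_project_type(workload: Workload) -> str:
    """Return "Spark" or "SQL" using the rules of thumb from the text."""
    # Any Spark-leaning signal wins, because those needs are hard to meet in SQL.
    if (workload.semi_structured_data
            or workload.complex_pipelines
            or workload.needs_large_scale):
        return "Spark"
    # Otherwise prefer the simpler option (assumption: SQL is the default).
    return "SQL"


analytics = Workload(True, False, False, False)
engineering = Workload(True, True, True, True)
print(suggest_project_type(analytics))    # SQL
print(suggest_project_type(engineering))  # Spark
```

Because the project type cannot be changed later, it can be worth running through a checklist like this before creating the project.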
Components
Project components vary based on project type. While SQL and Spark projects share common elements like pipelines and gems, their functionality differs significantly. Each is documented separately: Pipeline development for Analysts focuses on SQL projects, while Pipeline development for Engineers focuses on Spark projects.
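The first table below lists the components of SQL projects. The second table lists the components of Python and Scala (Spark) projects.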
Component | Description |
---|---|
Pipelines | Sequences of steps that run on Prophecy Automate and SQL warehouses. |
Gems | Representations of individual data transformation steps in a pipeline or model. |
Tables | SQL tables, views, or seeds. |
Functions | SQL macros used in gem expressions. |
Tests | Automated validations ensuring referential integrity, data consistency, and other quality checks. |
Schedules | Schedules for periodic pipeline execution managed by Prophecy Automate. |
Models | SQL transformations that define a single table or view. Models only appear in projects that enable Normal or Fork per User Git storage models. (Only applicable for data engineers.) |
Component | Description |
---|---|
Pipelines | Sequences of steps that run on Spark-native code. |
Datasets | Pointers to tables that are stored in the external data provider defined in a fabric. |
Jobs | Schedules for pipeline execution managed by external orchestration tools like Databricks Jobs and Airflow. |
Gems | Representations of individual data transformation steps in a pipeline. |
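To make the overlap between the two project types concrete, the component names from the tables above can be written as sets and compared. This is a plain-Python sketch for illustration only; the `COMPONENTS` mapping and the helper functions are hypothetical and are not part of any Prophecy API.

```python
# Component names copied from the two tables above.
COMPONENTS = {
    "SQL": {"Pipelines", "Gems", "Tables", "Functions", "Tests", "Schedules", "Models"},
    "Spark": {"Pipelines", "Datasets", "Jobs", "Gems"},
}


def shared_components() -> set:
    """Components that appear in both SQL and Spark projects."""
    return COMPONENTS["SQL"] & COMPONENTS["Spark"]


def unique_components(project_type: str) -> set:
    """Components that appear only in the given project type."""
    other = "Spark" if project_type == "SQL" else "SQL"
    return COMPONENTS[project_type] - COMPONENTS[other]


print(sorted(shared_components()))           # ['Gems', 'Pipelines']
print(sorted(unique_components("Spark")))    # ['Datasets', 'Jobs']
```

Note that a shared name does not imply identical behavior: pipelines and gems exist in both project types, but they run on different engines.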
Versioning
All projects are automatically compiled into code and hosted on Git for powerful version control. Prophecy offers several version control options, which you can configure during project creation. The available options vary depending on the project type.
The first table below applies to SQL projects. The second table applies to Python and Scala (Spark) projects.
Parameter | Options |
---|---|
Git Account |
|
Git Storage Model |
|
Parameter | Options |
---|---|
Git Account |
|
Git Storage Model |
|
Access and sharing
In Prophecy, projects are tied to a specific team. This assignment dictates the project's ownership and edit permissions.
When you begin using Prophecy, you are added to your own one-person team. Personal teams are ideal when you want to keep projects private and accessible only to yourself. Your team administrator will typically create shared teams.
Team ownership
Only members of the team assigned to a project have permission to modify its components (pipelines, gems, etc.).
Sharing with other teams (read-only)
To extend the reach of your project, you can share it with other teams.
- Read-only access: Users from other teams cannot directly edit the original project's components.
- Reuse components: If you share the project with other teams and publish it to the Package Hub, users can import the project as a package in their own projects. While they can't edit the original components, they can use copies of them in their own projects.
- Run pipelines: If you share projects that contain business apps with other teams, users can execute business apps that rely on the pipelines within the shared project.
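The ownership and sharing rules above can be summarized as a small permission model. This is a minimal sketch under stated assumptions, not how Prophecy implements access control: the `Project` class, its fields, and `allowed_actions` are hypothetical, and the sketch assumes the owning team can also do everything a shared team can.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Hypothetical model of a project's access settings."""
    name: str
    owner_team: str
    shared_with: set = field(default_factory=set)
    published_to_package_hub: bool = False
    has_business_apps: bool = False


def allowed_actions(project: Project, team: str) -> set:
    """Return the actions a team may perform on a project."""
    is_owner = team == project.owner_team
    is_shared = team in project.shared_with
    if not (is_owner or is_shared):
        return set()  # The project is not visible to this team.

    actions = {"view"}
    if is_owner:
        actions.add("edit")  # Only the owning team can modify components.
    if project.published_to_package_hub:
        actions.add("import_as_package")  # Reuse copies, not the originals.
    if project.has_business_apps:
        actions.add("run_business_apps")
    return actions


sales = Project("sales", owner_team="analytics", shared_with={"finance"},
                published_to_package_hub=True)
print(sorted(allowed_actions(sales, "analytics")))  # ['edit', 'import_as_package', 'view']
print(sorted(allowed_actions(sales, "finance")))    # ['import_as_package', 'view']
print(sorted(allowed_actions(sales, "marketing")))  # []
```

The key point the model captures is that sharing never grants edit rights: a shared team gains visibility and reuse, while editing stays with the owning team.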
Metadata
The Metadata page in Prophecy provides a searchable directory of projects and project components including pipelines, models, and jobs. All projects that are shared with your teams are visible in the Projects tab of the Metadata page. You can click into each project to access more granular metadata about that project.
You can view and edit the following metadata for your projects:
Metadata | Description |
---|---|
About | An overview of your project and space for an in-depth description of the project. |
Content | A list of entities within the project, such as pipelines and jobs, depending on your project type. |
Dependencies | The dependencies that exist in the project, including packages and Prophecy libraries. |
Version control | Either the Git workflow of the project, or the version history of the project, depending on your project type. |
Deployments | A list of project versions that you have released and/or deployed (published). |
Access | The teams that can view your project via the Package Hub. |
Settings | Different configuration options for building and deploying your project. |
Project editor
To begin pipeline development, open your project from the IDE tab in the sidebar. This opens the project editor, where you can configure transformation gems and interactively run pipelines. To learn more about pipeline development, visit SQL pipeline development and Spark pipeline development.
If you want to change the name of your project, you must do so in the project metadata (not the project editor).
What's next
To continue learning about projects:
- Create multiple projects and compare different project types.
- Follow one of our tutorials to build a project end to end.
- Play with different project components to understand how they interact.