
Projects and Git

A Project is the primary unit of development and deployment to production in Prophecy.

A project contains:

  • Data Pipelines that read, transform, and write data using Spark
  • Datasets that point to the data read and written by the Data Pipelines
  • Jobs that run Data Pipelines on a schedule

Project is Code on Git

A project is code on Git. Within a project, the business logic of all assets, including Pipelines, Datasets, and Jobs, is stored as code on Git. The project can live in, for example, a repository on GitHub or a folder within a repository.

Project is code

Project Metadata

The Project Metadata page provides several views of your project. To open it, go to the Metadata page and click the name of your project.

Project Metadata

# | Name | Description
1 | Metadata tabs | Switch between the various metadata views available for this project
2 | Rename project | Change the name of the project. This only affects the name in the UI; no code directories will be renamed.
3 | Project description | Edit the project description
4 | Project language | Output language for this project
5 | Project Git status | A list of branches that you or other users have used while developing this project, as well as the number of uncommitted files in each branch.

If you hover over any of the branches, you'll have the option to commit the changes in that branch or delete the branch altogether.

Branch hover

Project Relations

The Relations tab on the Project Metadata page lists all the components that belong to this Project.

Project Relations

# | Name | Description
1 | Pipelines | List of Pipelines in this project
2 | Add Pipeline | Click this to add a new Pipeline
3 | Jobs | List of Jobs in this project
4 | Add Job | Click this to add a new Job
5 | Datasets | List of Datasets in this project
6 | Subgraphs | List of published Subgraphs in this project

Project Commits

The Commits tab on the Project Metadata page shows the current Git state of the project and allows you to step through the process of committing, merging, and releasing your changes.

Project Commits

# | Name | Description
1 | Current Branch | The branch you're currently working on
2 | Base Branch | The base (or upstream) branch you're comparing the Current Branch to
3 | Commit Button | Click this to commit any uncommitted files in this branch
4 | Unmerged commits | The number of commits that are in your Current Branch but are not yet in the Base Branch
5 | Uncommitted files | The number of files that are uncommitted in this branch. Use the commit button to save these changes into the Current Branch
6 | Remote commits | If you or another user has merged their changes into the Base Branch you can see the number here. Use the pull button to bring these changes into your Current Branch.

For a walkthrough of the different phases of committing a project, see this section.
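To make the Unmerged commits and Remote commits counts concrete, here is a minimal Python sketch on a toy commit graph. It assumes the counts behave like Git's usual ahead/behind comparison between two branch tips. The graph, the commit ids, and the function names are hypothetical and are not part of any Prophecy API.

```python
from typing import Dict, List, Set

# Hypothetical commit graph: each commit id maps to the ids of its parents.
CommitGraph = Dict[str, List[str]]


def ancestors(graph: CommitGraph, tip: str) -> Set[str]:
    """Return the tip commit plus every commit reachable from it."""
    seen: Set[str] = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph.get(commit, []))
    return seen


def branch_counts(graph: CommitGraph, current_tip: str, base_tip: str) -> Dict[str, int]:
    """Count commits that exist on only one side of the comparison."""
    current = ancestors(graph, current_tip)
    base = ancestors(graph, base_tip)
    return {
        # In the Current Branch but not yet in the Base Branch.
        "unmerged_commits": len(current - base),
        # In the Base Branch but not yet pulled into the Current Branch.
        "remote_commits": len(base - current),
    }


# Toy history: the Current Branch forked from "b", then the Base Branch gained "c".
graph: CommitGraph = {
    "a": [],
    "b": ["a"],
    "c": ["b"],  # tip of the Base Branch
    "d": ["b"],
    "e": ["d"],  # tip of the Current Branch
}

counts = branch_counts(graph, current_tip="e", base_tip="c")
```

In this toy history, the Current Branch has two commits the Base Branch lacks (`d` and `e`), and the Base Branch has one commit (`c`) that a pull would bring in.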

Development and Deployment

Prophecy provides a standard, recommended mechanism for Git-based development. Other mechanisms are possible, including fork-based development in our Enterprise product. A standard development pattern looks like this:

Project deploy

Here are the steps explained:

1. Create new project

Starting from the Create Entity page, click Project.

Create Entity page

In the Create Project pane you can set the name, output language (Scala or Python) and which team the project belongs to. It is strongly recommended that you connect to your Git repository to ensure that there is a secure copy of the code that you have direct access to.

New project

Caution: It is not currently possible to switch the output language of a project after it has been created. Please choose the appropriate language for your environment.

2. Create, edit and commit the Pipeline

When you create a new Pipeline, you have to choose the branch where it will be created - an existing one or a new one.

Then you develop the Pipeline, making changes and committing them to this branch as often as needed. To open the commit dialog, click the bottom bar; an orange bar indicates uncommitted changes. When you commit, your changes are preserved in Git and pushed to your branch.

Commit

3. Integrate changes

The four main phases of integrating your changes are: Commit, Pull, Merge, Release. Let's go over each in detail.
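For readers who think in Git terms, the four phases correspond roughly to the plain Git commands below. This is a sketch of the usual command-line equivalents, not a description of what Prophecy runs internally. The function name, the example branch names, and the `origin` remote are assumptions made for illustration.

```python
import shlex
from typing import List


def integration_commands(
    current_branch: str,
    base_branch: str,
    commit_message: str,
    version: str,
    release_notes: str,
    remote: str = "origin",  # assumed remote name
) -> List[List[str]]:
    """Build (but do not run) Git commands approximating the four phases."""
    return [
        # Commit: record local changes and push them to the working branch.
        ["git", "add", "--all"],
        ["git", "commit", "-m", commit_message],
        ["git", "push", remote, current_branch],
        # Pull: bring remote changes from the base branch into the working branch.
        ["git", "pull", remote, base_branch],
        # Merge: fold the working branch into the base branch and publish it.
        ["git", "checkout", base_branch],
        ["git", "merge", current_branch],
        ["git", "push", remote, base_branch],
        # Release: tag the current base-branch commit with a version and publish the tag.
        ["git", "tag", "-a", version, "-m", release_notes],
        ["git", "push", remote, version],
    ]


if __name__ == "__main__":
    # Print the commands for a hypothetical "dev" branch based on "main".
    for command in integration_commands(
        "dev", "main", "Add customer pipeline", "1.0.0", "First release"
    ):
        print(shlex.join(command))
```

The function only returns the commands as argument lists, so it can be inspected or printed without touching a repository.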

Commit

A Commit represents changes to one or more files in your Project. Commits are what allow you to keep and view the history of all the changes made while developing your Pipelines. You can create a commit from either the Project Commits page or the Pipeline editor itself. Committing saves the changes you've been working on into your Branch and pushes them to your Git repository so that they're safely stored.

When committing from the Project Commits page, you'll see the following:

Project commit page

# | Name | Description
1 | Change log | This is a log of all the changes that have been made to (or merged into) the Current Branch
2 | Changed files | This is a list of all of the changed files that will be committed
3 | Reset | If you need to reset all changes that have happened since the last commit, click this button
4 | Commit message | The message to include as part of the commit

Pull

Pull brings changes that have occurred in remote Branches into the Prophecy-local branches. If you have any upstream changes that need to be pulled into the local branches you'll see the following:

Project pre pull page

Click the button to pull the changes and you'll see the Pull view:

Project pull view

Merge

Merge takes the changes in the Current Branch and merges them into the Base Branch. Your changes become part of the Base Branch and are available to anyone else whose work is based on the Base Branch. This corresponds to steps 3 and 5 of this diagram.

Project merge

Click the Merge button to merge the changes and push them back to your Git repository.

Release

Release tags a particular commit in the Base Branch with a user-specified version (step 6 in this diagram). This allows you to designate a new version as ready for production, or to inform users who may be subscribed to Datasets defined within your Project that there might be changes in the published Dataset.

Project release

# | Name | Description
1 | Commit selection | Pick which commit will be tagged for release
2 | Release notes | Free-form notes for the release
3 | Version | Enter whatever you'd like here. Best practices exist such as Semantic Versioning, but you're free to use whatever matches your environment
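If you do adopt Semantic Versioning for the Version field, the next version follows mechanically from the kind of change being released. The sketch below is a simplified illustration that handles only the core MAJOR.MINOR.PATCH form, without pre-release or build metadata. The function names are hypothetical, and Prophecy itself accepts any version string.

```python
import re
from typing import Tuple

# Core MAJOR.MINOR.PATCH only; pre-release and build metadata are not handled.
_CORE_VERSION = re.compile(r"([0-9]+)\.([0-9]+)\.([0-9]+)")


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a version such as '1.4.2' into (major, minor, patch)."""
    match = _CORE_VERSION.fullmatch(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def bump(version: str, part: str) -> str:
    """Return the next version after a 'major', 'minor', or 'patch' change."""
    major, minor, patch = parse_version(version)
    if part == "major":  # incompatible change: reset minor and patch
        return f"{major + 1}.0.0"
    if part == "minor":  # backward-compatible addition: reset patch
        return f"{major}.{minor + 1}.0"
    if part == "patch":  # backward-compatible fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

For example, `bump("1.4.2", "minor")` returns `"1.5.0"`, which could signal to Dataset subscribers that the release adds something without breaking existing consumers.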