Deployment

Overview

Prophecy Deployment Modes

Prophecy supports multiple deployment modes. Some of our customers use the Prophecy SaaS version, where they sign in and connect to their existing Spark account. Other customers are large enterprises that require custom deployments.

Prophecy SaaS

Prophecy SaaS currently supports Databricks. The sign-up process is largely self-explanatory; all you need to provide is

  • Email
  • Pointer to your Databricks workspace - URL and token
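
As a sketch, these connection details correspond to the same two values the Databricks CLI reads from its standard environment variables; the host URL and token below are placeholders, not real credentials:

```shell
# Placeholder values -- substitute your own workspace URL and personal access token.
export DATABRICKS_HOST="https://dbc-a1b2c3d4-e5f6.cloud.databricks.com"
export DATABRICKS_TOKEN="dapi0123456789abcdef"
```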

Here are the primary interactions:

  • Prophecy spins up a Databricks cluster on demand for development while you’re building workflows.
  • Your data stays within your network, and so does the compute, so this is fairly secure.
  • Any secrets you create are stored in Databricks using dbutils; Prophecy does not store them.
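
To make the URL-plus-token interaction concrete, here is a minimal sketch of how a client builds a token-authenticated request against the Databricks REST API. The `databricks_request` helper, the workspace URL, and the token are all hypothetical illustrations, not Prophecy's actual internals; `clusters/list` is the standard Databricks Clusters API endpoint:

```python
import urllib.request

def databricks_request(host: str, token: str, endpoint: str) -> urllib.request.Request:
    """Build (but do not send) a token-authenticated Databricks REST API request.

    Hypothetical helper for illustration; Prophecy's implementation is not public.
    """
    return urllib.request.Request(
        url=f"{host}/api/2.0/{endpoint}",
        headers={"Authorization": f"Bearer {token}"},
    )

# Example: the Clusters API endpoint a client could use to list clusters.
req = databricks_request(
    "https://dbc-example.cloud.databricks.com", "dapi-example-token", "clusters/list"
)
```

Because the token travels as a bearer header over HTTPS to your workspace URL, the data and compute never leave your Databricks account.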

Use the following screen to sign up; there is a link at the top right for instructions:

Prophecy Enterprise

Prophecy Enterprise is deployed within the customer’s network, ideally on a Kubernetes cluster on-premises or on a public cloud. Prophecy integrates with the rest of your infrastructure. Here is a logical model of how Prophecy deploys:

Deeper Dive

Prophecy installs in your private cloud, or within your network in the public cloud. Here are the primary integration points:

  1. Spark: Prophecy can connect in two ways
    • Livy: a Livy server is present on a majority of platforms, including cloud providers such as EMR and on-premises vendors such as Cloudera, Hortonworks, and MapR.
    • API: Databricks provides an API at a public URL, accessed via a token, and this works well with Prophecy. We also use this API for other tasks, such as listing files in DBFS.
  2. Identity: For enterprises, an identity integration is a must-have; we connect to Active Directory and LDAP and use those user identities. For Spark on Hadoop clusters secured with Kerberos, we pass this user identity into the Hadoop cluster for jobs spun up by the user.
  3. Git: We connect to the customer’s Git - this can be GitHub Enterprise, Bitbucket, GitLab, or any other Git provider.
  4. CI/CD: If you’re using Prophecy Airflow for scheduling, we take care of the CI/CD pipeline.
    • However, many enterprises have an existing enterprise scheduler (more prevalent on-premises). In this case, you can use Prophecy for development only and commit to Git from Prophecy; your CI/CD pipeline can then be triggered from that commit. Tests developed in Prophecy are often run in this pipeline as well.
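
For the Livy integration point above, here is a minimal sketch of how a client opens an interactive Spark session through Livy's REST API (`POST /sessions`). The Livy host is a placeholder, and the session name is hypothetical; `kind` selects the interpreter (`spark`, `pyspark`, `sparkr`, or `sql`). The request is constructed but not sent:

```python
import json
import urllib.request

# Payload for creating an interactive session via Livy's REST API.
payload = {"kind": "spark", "name": "prophecy-dev-session"}

req = urllib.request.Request(
    url="http://livy.example.internal:8998/sessions",  # placeholder Livy host
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
```

On a Kerberized Hadoop cluster, this is also where the user's identity would be attached (e.g., via SPNEGO), so that Spark jobs run as the submitting user rather than as a shared service account.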