Low Code Spark

Fabrics

In Prophecy, a fabric is an execution environment: it is where you configure your Spark endpoint details, Spark credentials, and related settings. This page walks you through configuring your fabrics.

Overview

Finding your Fabrics

You can find your fabrics on the Metadata screen.

Data Plane Config

You should leave this section as it is; it contains the URL of a Prophecy service.

Spark Config

Here you configure the endpoints of your Spark setup. We support three modes, each with different configuration fields.

Provider: Your Spark provider type. You can choose Databricks, Livy, or EMR here; the remaining fields change based on this selection.
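
As a mental model only, you can picture the Spark Config as a provider discriminator plus provider-specific fields. A minimal sketch in Python (the field names below are illustrative, not Prophecy's actual schema):

```python
# Illustrative only: hypothetical shapes of a fabric's Spark Config per provider.
# Field names are made up for explanation; the real schema lives in the Prophecy UI.
spark_configs = {
    "Databricks": {"url": "https://<workspace>.cloud.databricks.com", "version": "8.x"},
    "Livy":       {"url": "http://<livy-host>:8998", "auth": "None"},
    "EMR":        {"bucket": "<bucket-name>", "long_living_cluster_id": "<job-flow-id>"},
}
```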

User Config

Here you configure your credentials for your Spark setup: a Databricks token if you're on Databricks, or AWS credentials if you're on EMR. This is kept separate from the Spark Config because an endpoint is typically shared by multiple people, each of whom has their own credentials for accessing it. For that reason, you specify the Spark Config once per fabric, but each user provides their own credentials on the User Config tab.

Job Sizes

Here you configure job sizes for your fabric. The idea is to use human-readable names (XS, S, M, L, XL, and so on) to abstract away the underlying cores-and-memory configuration.
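
For illustration, a job-size table is just a mapping from a friendly name to executor resources. A sketch with made-up values (configure your own sizes in the fabric):

```python
# Illustrative only: human-readable job sizes abstracting cores + memory.
# The actual values are whatever you define for your fabric.
JOB_SIZES = {
    "XS": {"cores": 1,  "memory": "2g"},
    "S":  {"cores": 2,  "memory": "4g"},
    "M":  {"cores": 4,  "memory": "8g"},
    "L":  {"cores": 8,  "memory": "16g"},
    "XL": {"cores": 16, "memory": "32g"},
}
```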

How to Configure a Fabric for Databricks

Spark Config

The endpoint and other configurations are available in the Spark Config tab.

Cloud: Whether your Databricks workspace runs on Azure, AWS, or Community Edition.
URL: The Databricks workspace URL.
User Agent: You can leave this empty. Fill it in if you want to send a custom user agent with Databricks API calls; otherwise we send "User-Agent: Prophecy" on all Databricks REST API calls.
Long Living Cluster ID: This can also be left empty for now.
Version: The Databricks runtime version you want your clusters to have. We support 6.x, 7.x, and 8.x versions.
Auto Termination Timeout: Clusters you spin up using Prophecy will auto-terminate after being idle for this long.
Machine Type: The machine type for your Databricks clusters. Names are cloud-dependent; for example, Standard_DS3_v2 on Azure or i3.xlarge on AWS.
AWS Instance Profile ARN: Use this field to attach an instance profile to your clusters' machines (see the Databricks documentation on instance profiles). Applicable only to AWS Databricks, not Azure.
AWS Glue: If you're on AWS Databricks, enable this to use Glue as the catalog for your clusters.
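
To see where these fields land, here is a rough sketch of the kind of request a cluster launch translates to, using the public Databricks clusters/create REST endpoint. The payload values are placeholders, and the exact request Prophecy sends may differ:

```python
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"  # the URL field
TOKEN = "<databricks-personal-access-token>"                     # from User Config

# Illustrative payload: Version -> spark_version, Machine Type -> node_type_id,
# Auto Termination Timeout -> autotermination_minutes,
# AWS Instance Profile ARN -> aws_attributes.instance_profile_arn.
payload = {
    "cluster_name": "prophecy-example",
    "spark_version": "8.4.x-scala2.12",  # pick a supported 6.x/7.x/8.x runtime key
    "node_type_id": "i3.xlarge",
    "autotermination_minutes": 30,
    "num_workers": 2,
    "aws_attributes": {
        "instance_profile_arn": "arn:aws:iam::123456789012:instance-profile/example",
    },
}

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/clusters/create",
    # "User-Agent: Prophecy" matches the default user agent described above.
    headers={"Authorization": f"Bearer {TOKEN}", "User-Agent": "Prophecy"},
    json=payload,
)
print(resp.json())  # contains the new cluster_id on success
```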

User Config

Provide your Databricks token in the User Config tab; each logged-in user must supply their own token here.

Other fields can be safely ignored.
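
A quick way to check a token before entering it is to hit any authenticated Databricks REST endpoint with it. A sketch using clusters/list (placeholders are yours to fill in):

```python
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<databricks-personal-access-token>"

# List clusters; a 200 response means the token authenticates successfully.
resp = requests.get(
    f"{WORKSPACE_URL}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
print("token ok" if resp.status_code == 200 else f"auth failed: {resp.status_code}")
```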

How to Configure a Fabric for Livy

Spark Config

If you have your own Spark installation with Livy on top of it, Prophecy can connect to that Livy server.

URL: The URL of the Livy service.
Auth: By default Livy has no authentication, so this can be left at None. We also support Kerberos as an auth mode; its setup is more involved, so contact support@prophecy.io for further help.
Executor Cores + Memory: The size of one executor. Your jobs will use these values when asking the cluster manager for executors.
Impersonation: When this is off, Livy creates all sessions as the livy user. When it is on, Livy creates sessions as the end user.
Prophecy Jar: The path to the Prophecy libs jar, typically on distributed storage such as HDFS or a cloud object store (S3/Blob). The Prophecy team will provide the jar; upload it to a file store of your choice and put that path in this field.
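
For reference, these fields map fairly directly onto Livy's open-source REST API. A sketch that creates a session on an unauthenticated Livy endpoint (all values are placeholders):

```python
import requests

LIVY_URL = "http://<livy-host>:8998"  # the URL field

# Executor Cores + Memory -> executorCores / executorMemory,
# Impersonation -> proxyUser, Prophecy Jar -> jars.
session_spec = {
    "kind": "spark",
    "executorCores": 2,
    "executorMemory": "4g",
    "proxyUser": "alice",  # only honored when impersonation is enabled
    "jars": ["hdfs:///jars/prophecy-libs.jar"],
}

resp = requests.post(
    f"{LIVY_URL}/sessions",
    json=session_spec,
    headers={"X-Requested-By": "prophecy"},  # required by some Livy deployments
)
print(resp.json())  # includes the session id and state
```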

User Config

In most cases you do not need to provide anything on this tab. You can provide your cloud credentials if you'd like Livy to access S3 or blob storage.
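
If you do need S3 access, one common route is passing the standard Hadoop s3a settings through a Livy session's conf map. A sketch; whether keys (rather than instance roles) are appropriate depends on your environment:

```python
# Standard Hadoop s3a credential settings, passed via a Livy session's "conf" map.
# Placeholder values; prefer instance roles over long-lived keys where possible.
session_spec = {
    "kind": "spark",
    "conf": {
        "spark.hadoop.fs.s3a.access.key": "<aws-access-key-id>",
        "spark.hadoop.fs.s3a.secret.key": "<aws-secret-access-key>",
    },
}
```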

How to Configure a Fabric for EMR

If you are on EMR, we support that too. Behind the scenes, we simply connect to the Livy server running on your EMR cluster, so Livy must be enabled on it.

Spark Config

Bucket: A bucket the EMR cluster has access to. Provide just the name of the bucket, not a fully qualified path.
Prophecy Jar: The path to the Prophecy libs jar, typically on the bucket above. The Prophecy team will provide the jar; upload it to this bucket and provide the full jar path in this field.
Long Living Cluster Id: The ID of the EMR cluster you want to connect to Prophecy, also called the Job Flow ID.
Version: The version of your EMR cluster.
Subnet Id: The subnet your EMR cluster is part of.
Machine Type: The machine type of the executor machines.
EMR Role: See the AWS documentation on EMR roles.
EC2 Instance Profile: See the AWS documentation on EC2 instance profiles.
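
Because Prophecy talks to the Livy server on the cluster, you can sanity-check a cluster ID by resolving its master node and probing Livy on its default port. A sketch using boto3 (region and cluster ID are placeholders; 8998 is Livy's default port on EMR):

```python
import boto3
import requests

CLUSTER_ID = "j-XXXXXXXXXXXXX"  # the Long Living Cluster Id / Job Flow Id

# Resolve the master node's DNS name from the cluster ID.
emr = boto3.client("emr", region_name="us-east-1")  # adjust the region
master_dns = emr.describe_cluster(ClusterId=CLUSTER_ID)["Cluster"]["MasterPublicDnsName"]

# Livy listens on port 8998 by default on EMR.
resp = requests.get(f"http://{master_dns}:8998/sessions")
print(resp.json())  # lists active Livy sessions if Livy is reachable
```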

User Config

Here you need to provide your AWS credentials so that Prophecy can connect to this EMR cluster as you. As mentioned above, each user needs to complete this step.
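
Before entering your keys, you can confirm they work with a quick STS GetCallerIdentity call, which needs no special permissions. A sketch using boto3 (placeholder values):

```python
import boto3

# Placeholders: the access key pair you plan to enter in the User Config tab.
sts = boto3.client(
    "sts",
    aws_access_key_id="<aws-access-key-id>",
    aws_secret_access_key="<aws-secret-access-key>",
)
print(sts.get_caller_identity()["Arn"])  # prints the IAM identity these keys belong to
```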