In Prophecy, a Fabric is an execution environment: it is where you configure your Spark endpoint details, Spark credentials, and related settings. This page walks you through configuring your Fabrics.
Finding your Fabrics
You can find your Fabrics on the Metadata screen.
Data Plane Config
Leave this at its default value. It contains the URL of a Prophecy service.
Spark Config
Here you configure the endpoints of your Spark setup. We support three modes, each with its own configuration fields.
| Field | Description |
| --- | --- |
| Provider | Your Spark provider type: Databricks, Livy, or EMR. The remaining fields change based on this choice. |
User Config
Here you configure the credentials for your Spark setup: a Databricks token if you're on Databricks, or AWS credentials if you're on EMR. Credentials are kept separate from the Spark Config because one endpoint is typically shared by multiple people, each of whom has their own credentials for that endpoint. For that reason, the Spark Config is specified once per Fabric, but each user must provide their own credentials on the User Config tab.
Job Sizes
Here you configure job sizes for your Fabric. The idea is to use human-readable names (XS, S, M, L, XL, etc.) to abstract away the underlying cores-and-memory configuration.
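Conceptually, a job-size table is just a mapping from a label to a resource spec. A minimal sketch follows; the specific core and memory values are illustrative assumptions, not Prophecy defaults:

```python
# Hypothetical job-size table: label -> executor cores and memory.
# The specific values below are illustrative assumptions, not Prophecy defaults.
JOB_SIZES = {
    "XS": {"cores": 1, "memory": "1g"},
    "S":  {"cores": 2, "memory": "4g"},
    "M":  {"cores": 4, "memory": "8g"},
    "L":  {"cores": 8, "memory": "16g"},
    "XL": {"cores": 16, "memory": "32g"},
}

def resolve_job_size(label: str) -> dict:
    """Translate a human-readable size label into a concrete resource spec."""
    try:
        return JOB_SIZES[label]
    except KeyError:
        raise ValueError(f"Unknown job size {label!r}; expected one of {sorted(JOB_SIZES)}")

print(resolve_job_size("M"))
```

Users then pick a label when scheduling a job instead of reasoning about cores and memory directly.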
How to Configure a Fabric for Databricks
The endpoint and other settings are configured on the Spark Config tab.
| Field | Description |
| --- | --- |
| Cloud | Whether your Databricks workspace is on Azure, AWS, or Community Edition. |
| URL | The Databricks workspace URL. |
| User Agent | You can leave this empty. Fill it in if you want to send a custom user agent with Databricks API calls; otherwise a default user agent is sent. |
| Long Living Cluster ID | This can also be left empty for now. |
| Version | The Databricks Runtime version you want your clusters to have. We support the 6.x, 7.x, and 8.x versions. |
| Auto Termination Timeout | Clusters you spin up through Prophecy will auto-terminate after being idle for this long. |
| Machine Type | The machine type for your Databricks clusters. The available names are cloud-dependent. |
| AWS Instance Profile ARN | Use this field to attach an instance profile to your clusters' machines (see the Databricks documentation on instance profiles). Only applicable to AWS Databricks, not Azure. |
| AWS Glue | If you're on AWS Databricks, enable this to use AWS Glue as the catalog for your clusters. |
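To make the fields above concrete, they roughly correspond to the cluster specification that the Databricks Clusters REST API accepts. A minimal sketch, where the field names follow the public Databricks API but every value is a placeholder assumption:

```python
import json

# Sketch of a Databricks cluster spec assembled from the Fabric fields above.
# Field names follow the Databricks Clusters REST API; all values are
# placeholder assumptions, not Prophecy defaults.
cluster_spec = {
    "spark_version": "8.4.x-scala2.12",   # the "Version" field
    "node_type_id": "i3.xlarge",          # the "Machine Type" field (AWS naming)
    "autotermination_minutes": 60,        # the "Auto Termination Timeout" field
    "aws_attributes": {
        # The "AWS Instance Profile ARN" field; only meaningful on AWS Databricks.
        "instance_profile_arn": "arn:aws:iam::123456789012:instance-profile/example",
    },
}

print(json.dumps(cluster_spec, indent=2))
```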
The Databricks token must be provided on the User Config tab; each logged-in user provides their own token here.
The other fields can safely be ignored.
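For context on what the token does: Databricks REST APIs authenticate personal access tokens as a bearer token in the `Authorization` header. A small sketch, where the workspace URL and token are placeholder assumptions:

```python
# Sketch of how a Databricks personal access token authenticates REST calls.
# Databricks expects a bearer token in the Authorization header; the workspace
# URL and token below are placeholder assumptions.
WORKSPACE_URL = "https://example.cloud.databricks.com"
TOKEN = "dapi-placeholder-token"

def databricks_headers(token: str) -> dict:
    """Build the auth header Databricks REST APIs expect."""
    return {"Authorization": f"Bearer {token}"}

# e.g. requests.get(f"{WORKSPACE_URL}/api/2.0/clusters/list",
#                   headers=databricks_headers(TOKEN))
print(databricks_headers(TOKEN))
```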
How to Configure a Fabric for Livy
If you have your own Spark installation with Livy on top of it, Prophecy can connect to that Livy.
| Field | Description |
| --- | --- |
| URL | The URL of the Livy service. |
| Auth | By default Livy has no authentication, so this can be left at None. We also support Kerberos; its setup is a little more involved, so contact firstname.lastname@example.org for further help. |
| Executor Cores + Memory | The size of one executor. Your jobs use these values when asking the cluster manager for executors. |
| Impersonation | When it's off, Livy creates all sessions as the user the Livy server runs as; when it's on, sessions are created on behalf of the logged-in user. |
| Prophecy Jar | The path to the Prophecy libs jar, typically on distributed storage such as HDFS or a cloud object store (S3/Blob). The Prophecy team will provide the jar; upload it to a file store of your choice and put that path in this field. |
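The fields above map onto the request body that Livy's `POST /sessions` endpoint accepts. A sketch of such a payload, where the field names follow the open-source Livy REST API and the values are placeholder assumptions:

```python
import json

# Sketch of a session-creation payload for Livy's POST /sessions endpoint.
# Field names follow the open-source Apache Livy REST API; the values are
# placeholder assumptions.
session_request = {
    "kind": "spark",
    "executorCores": 2,               # the "Executor Cores" field
    "executorMemory": "4g",           # the "Executor Memory" field
    "proxyUser": "alice",             # used when impersonation is on
    "jars": ["hdfs:///jars/prophecy-libs.jar"],  # the "Prophecy Jar" field
}

print(json.dumps(session_request))
```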
In most cases you do not need to provide anything in this tab. Provide your cloud credentials if you'd like Livy to access S3 or blob storage.
How to Configure a Fabric for EMR
If you are on EMR, we support that too. Behind the covers, we simply connect to the Livy running on your EMR cluster, so your EMR cluster needs to have Livy enabled.
| Field | Description |
| --- | --- |
| Bucket | A bucket the EMR cluster has access to. We need only the name of the bucket, not a fully qualified path. |
| Prophecy Jar | The path to the Prophecy libs jar, typically on the bucket mentioned above. The Prophecy team will provide the jar; upload it to this bucket and put the full jar path in this field. |
| Long Living Cluster Id | The ID of the EMR cluster you want to connect to Prophecy (also called the Job Flow ID). |
| Version | The version of your EMR cluster. |
| Subnet Id | The subnet your EMR cluster is part of. |
| Machine Type | The machine type of the executor machines. |
| EMR Role | The EMR service role. See the AWS documentation on EMR IAM roles. |
| EC2 Instance Profile | The EC2 instance profile for the cluster machines. See the AWS documentation. |
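Since the EMR integration works by talking to the cluster's Livy, the connection boils down to reaching Livy on the master node, which listens on port 8998 by default. A sketch, with a placeholder master DNS name:

```python
# Sketch: Livy on EMR listens on port 8998 of the master node by default,
# so connecting Prophecy to an EMR cluster reduces to reaching that URL.
# The master DNS name below is a placeholder assumption.
LIVY_PORT = 8998

def livy_url(master_dns: str, port: int = LIVY_PORT) -> str:
    """Build the Livy endpoint URL for an EMR master node."""
    return f"http://{master_dns}:{port}"

print(livy_url("ec2-198-51-100-1.compute-1.amazonaws.com"))
```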
Here you need to provide your AWS credentials so that Prophecy can connect to this EMR cluster as you. As mentioned above, every user must complete this step.