Low Code Spark


You can develop in the Visual Editor and inspect the generated code in the Code Editor.

Visual Editor

Here is an overview of the visual development screen.

Gem Toolbar

The Gem toolbar contains the built-in gems provided by Prophecy.

Gem in Workflow

Gems have input ports and output ports. Each port has a name, though most are simply called in and out. You connect the output port of one gem to the input port of the next gem using an edge. An edge represents the flow of a Spark DataFrame. An output port can have many outgoing edges, so a single output can feed many inputs. An input port can have only one incoming edge.
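The connection rules above can be sketched as a small graph model. This is only an illustration in plain Python, not Prophecy's implementation: the `Workflow` class, its `connect` method, and the gem names are all hypothetical.

```python
from collections import defaultdict


class Workflow:
    """Toy model of gems connected by edges (hypothetical, for illustration only)."""

    def __init__(self):
        # (gem, input port) -> (gem, output port): at most one incoming edge per input port
        self.incoming = {}
        # (gem, output port) -> list of (gem, input port): any number of outgoing edges
        self.outgoing = defaultdict(list)

    def connect(self, src_gem, src_port, dst_gem, dst_port):
        """Add an edge from an output port to an input port."""
        dst = (dst_gem, dst_port)
        if dst in self.incoming:
            raise ValueError(
                f"input port {dst_gem}.{dst_port} already has an incoming edge"
            )
        self.incoming[dst] = (src_gem, src_port)
        self.outgoing[(src_gem, src_port)].append(dst)


wf = Workflow()

# One output port fans out to the input ports of two different gems: allowed.
wf.connect("customers", "out", "cleanup", "in")
wf.connect("customers", "out", "report", "in")

# A second edge into an already connected input port is rejected.
try:
    wf.connect("orders", "out", "cleanup", "in")
    second_edge_rejected = False
except ValueError:
    second_edge_rejected = True
```

Keying the incoming edges by input port makes the "only one incoming edge" rule a simple dictionary lookup, while the outgoing side is a list because fan-out is unrestricted.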

Each gem has a prominent, colored label on top. To edit it, double-click the gem and rename it in the title of the dialog box. The label is also the name of the function that represents the component in the generated code, so choose it thoughtfully. A gem also has a type, which is shown in gray below the component.

Hover over a gem and you'll see a ... menu at the top right and a play button at the bottom right.

Clicking the ... menu opens the following options.

| Option | Description |
| --- | --- |
| Rename | Change the name (label) of the gem, though it is easier to open the gem and rename it there. |
| Change phase | Phase is the number that appears in the bottom left of the gem icon and is 0 here. Gems with a lower phase execute before gems with a higher phase. This adds an ordering between gems when no data flows between them. |
| Delete | Delete this gem. |
| Detailed Stats | Compute more detailed statistics for the data flowing out of this gem. This can help you choose partition keys or see the most common values. |
| Cache | Cache the DataFrame flowing out of this gem. Repeated runs of the following gems will not re-execute the steps before this gem, but will use the cached copy instead. |
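To make the phase rule concrete, here is a minimal sketch in plain Python of ordering unconnected gems by phase. The `Gem` class, the `execution_order` function, and the gem labels are hypothetical. The sketch ignores edges entirely, and how gems that share a phase are ordered in the product is not specified here, so the tie-breaking below is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Gem:
    label: str      # hypothetical gem label
    phase: int = 0  # new gems show phase 0


def execution_order(gems):
    """Return gems sorted by phase, lowest phase first.

    sorted() is stable, so gems with the same phase keep their input order.
    That tie-breaking is an assumption of this sketch.
    """
    return sorted(gems, key=lambda gem: gem.phase)


gems = [
    Gem("write_report", phase=1),
    Gem("load_customers"),                # phase 0
    Gem("cleanup_temp_tables", phase=2),
    Gem("load_orders"),                   # phase 0
]

order = [gem.label for gem in execution_order(gems)]
print(order)
# ['load_customers', 'load_orders', 'write_report', 'cleanup_temp_tables']
```

In this example `cleanup_temp_tables` has no data connection to the other gems, so raising its phase is the only thing that makes it run last.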

Inside Gem Editor

Most gems share a similar editor layout. Here is an image, followed by an explanation of each section.

| Section | Description |
| --- | --- |
| Left Panel | The left panel shows the inputs to the gem. Gems with a variable number of ports, such as MultiJoin or SetOperation (UnionAll), sometimes have a button here to add more input ports. |
| Port Name | The name of the input port. |
| Previous Component | The name of the previous gem whose output is connected to this input port. |
| Input Columns & Data Types | For each input port, the names of the input columns and their data types are shown. |
| Selected Column | You can click columns to select them. When clicked, they will often show up in the right panel. |
| Right Panel | This is the business area and covers most of the dialog. Here, it has Target Columns and Expressions. |
| Language | The language in which you want to see and edit the expressions. This is independent of the language in which the code is stored on Git. Most people prefer to use SQL here. |
| Expression Builder | Helps you write expressions quickly by suggesting built-in functions (and UDFs), column names, and operators. The builder always appears below the text being typed, so you can ignore it if it is not helping you. Pressing Escape makes it disappear, though typing more text brings it back. This works with all languages. |
| Data Drawer | Clicking this pulls up a drawer from the bottom that covers half the dialog. Here, you can see the input and output data for this component without closing it. |
| Unit Test Drawer | Clicking this pulls up a drawer from the bottom that covers half the dialog. Here, you can see, edit, and run the unit tests for the current component only. |

Code Editor

Visual and Code toggle: You can switch between the Visual and Code Editors by clicking the toggle button in the top-center area of the workflow editor.

You can then use the Code Editor, which is a fully functional IDE.