Data Copilot AI capabilities

Prophecy Data Copilot provides suggestions from an AI model as you develop your data Pipelines and Models. You can view and incorporate suggestions directly within the Prophecy visual editor and code editor. Data Copilot makes suggestions for your entire Pipeline, for a single Gem (transformation), and even for individual expressions within each Gem.

Supported AI capabilities by engine

Data Copilot supports the following capabilities for the Spark and SQL engines:

AI Capability | Spark | SQL
Start a new Pipeline (Spark) or Model (SQL) | Yes | Yes
Modify an existing Pipeline (Spark) or Model (SQL) | Yes | Yes
Suggest Gems | Yes | Yes
Suggest Expressions | Yes | Yes
Generate with AI, Scripts | Yes | Yes
Generate with AI, Functions | Yes | Yes
Map with AI | Yes | Yes
Code with AI | Yes | Yes
Fix with AI, Gems | Yes | Yes
Fix with AI, Expressions | Yes | Yes
Auto Documentation | Yes | Yes
Data Tests and Quality Checks | Yes | Yes

Text to Pipelines

Get started quickly by typing a prompt into the text box. Data Copilot will generate a new Pipeline or modify an existing one.

Start a new Pipeline

You can use Data Copilot to start a new Pipeline by typing a simple English text prompt.

The following example uses Data Copilot to help start a Pipeline:

  1. Type a prompt in plain English, such as "Which customers shipped the largest orders this year?"
  2. Data Copilot uses metadata from the accessible Datasets, Seeds, Pipelines, and Models to create a Knowledge Graph.
  3. Data Copilot creates the Pipeline based on the text prompt, using the Knowledge Graph as the context. This Pipeline is accessible in both the visual editor and the code editor.
  4. If you'd like, review the suggested changes before you decide to keep or reject the suggested Pipeline. Then interactively execute it to see the results.
  5. View Data Copilot's suggested changes in the visual editor.
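
To make the example prompt concrete, the following is a minimal sketch of the kind of logic such a Pipeline has to express: join orders to customers, keep this year's shipments, aggregate per customer, and sort. It is written in plain Python rather than Spark so that it runs anywhere. The `customers` and `orders` records, their field names, and the reading of "largest orders" as "largest single order per customer" are all assumptions for illustration. This is not Data Copilot's output.

```python
from datetime import date

# Hypothetical sample data; the names and fields are illustrative only.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
    {"customer_id": 3, "name": "Initech"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 250.0, "ship_date": date(2024, 3, 1)},
    {"order_id": 11, "customer_id": 1, "amount": 400.0, "ship_date": date(2024, 6, 9)},
    {"order_id": 12, "customer_id": 2, "amount": 900.0, "ship_date": date(2024, 2, 14)},
    {"order_id": 13, "customer_id": 3, "amount": 1200.0, "ship_date": date(2023, 11, 30)},
    {"order_id": 14, "customer_id": 3, "amount": 100.0, "ship_date": date(2024, 1, 5)},
]


def largest_orders_by_customer(customers, orders, year, top_n=2):
    """Return (customer name, largest order amount) pairs for `year`, largest first."""
    names = {c["customer_id"]: c["name"] for c in customers}
    largest = {}
    for order in orders:
        # Filter: keep only orders shipped in the requested year.
        if order["ship_date"].year != year:
            continue
        # Join: skip orders whose customer is unknown.
        name = names.get(order["customer_id"])
        if name is None:
            continue
        # Aggregate: track the largest order amount per customer.
        largest[name] = max(largest.get(name, 0.0), order["amount"])
    # Order by amount, descending, and keep the top entries.
    return sorted(largest.items(), key=lambda item: item[1], reverse=True)[:top_n]


print(largest_orders_by_customer(customers, orders, year=2024))
```

In a visual Pipeline, each commented step (filter, join, aggregate, order by) would correspond roughly to one transformation.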

Modify an existing Pipeline

You can also call Data Copilot to modify an existing Pipeline. Type a new text prompt, and Data Copilot will suggest a new sequence of data transformations. You don't need to select where the modification should go before Data Copilot makes its suggestion.

Added/updated Gems are highlighted in yellow.

Next-transformation suggestions

Data Copilot can suggest the next transformation in a series or the next expression within a Gem.

Suggest Gems

Data Copilot can suggest the next transformation for Leaf Nodes in a graph.

See the following Join suggestion example:

  1. Select and drop a Dataset of interest on the canvas.
  2. Data Copilot suggests Datasets that are frequently used with the selected Dataset.
  3. Data Copilot then suggests a next transformation, in this case, a Join Gem.

Suggest Expressions

As you continue developing within a Gem, Data Copilot can suggest expressions for it.

Within the advanced Expression Builder, the workflow is as follows:

  1. Type an English text prompt.
  2. Data Copilot generates a code expression for a particular column.
  3. Review the code expression, and if you'd like, try again with a different prompt.
  4. Run the Pipeline up to and including this Gem, and observe the resulting data sample.
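
As an illustration of steps 1 and 2, a prompt such as "full name in upper case" might correspond to a SQL expression like the one below. This is a sketch under stated assumptions: the prompt, the `customers` table, its columns, and the expression are hypothetical, and Python's built-in `sqlite3` module is used only to preview the expression locally. The expression Data Copilot actually generates depends on your schema and engine.

```python
import sqlite3

# Hypothetical prompt: "full name in upper case"
# Hypothetical expression for a new column named full_name.
EXPRESSION = "UPPER(first_name || ' ' || last_name)"


def preview_expression(rows, expression=EXPRESSION):
    """Evaluate `expression` against sample rows and return the resulting column values."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE customers (first_name TEXT, last_name TEXT)")
        conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
        cursor = conn.execute(
            f"SELECT {expression} AS full_name FROM customers ORDER BY rowid"
        )
        return [row[0] for row in cursor]
    finally:
        conn.close()


sample = [("ada", "lovelace"), ("grace", "hopper")]
print(preview_expression(sample))
```

Previewing the expression on a small sample mirrors step 4: you check the resulting data before accepting the suggestion.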

Generate with AI

Data Copilot can generate script Gems, user-defined functions in Spark, or macro functions in SQL.
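
For readers unfamiliar with user-defined functions, the sketch below shows the kind of small, per-value function that is a typical candidate. The function name `mask_email` and its masking rule are hypothetical and not taken from Prophecy. In PySpark, a function like this is typically wrapped with `pyspark.sql.functions.udf` before being applied to a DataFrame column; that step is omitted here so the example runs without Spark.

```python
def mask_email(email):
    """Hide the local part of an email address, keeping its first character and the domain."""
    # Leave nulls and values without an "@" untouched.
    if email is None or "@" not in email:
        return email
    local, _, domain = email.partition("@")
    if not local:
        return email  # Nothing to mask when the local part is empty.
    return local[0] + "*" * (len(local) - 1) + "@" + domain


print(mask_email("alice@example.com"))
```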

Map with AI

You don't have to worry about mapping the schema across your Model. Data Copilot will map the target schema to the existing Gems and Datasets.
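
To show what a schema mapping looks like as data, here is a minimal sketch that pairs each target column with a source column by normalized name. The name-matching heuristic and the column names are assumptions for illustration only; this page does not describe how Data Copilot computes its mapping.

```python
import re


def normalize(name):
    """Lower-case a column name and drop every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_schema(source_columns, target_columns):
    """Map each target column to the source column with the same normalized name, or None."""
    by_key = {normalize(column): column for column in source_columns}
    return {target: by_key.get(normalize(target)) for target in target_columns}


# Hypothetical source and target schemas.
source = ["Customer ID", "order_total", "ShipDate"]
target = ["customer_id", "order_total", "ship_date", "region"]
print(map_schema(source, target))
```

Target columns that map to `None` (here, `region`) are the ones that would still need a manual decision.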

Code with AI

In addition to the suggestions in the visual editor, you'll see code completion suggestions in the code editor.

Data Copilot helps you build your Model in the code interface by making predictions as you type your code. And when you go back to the visual interface, you'll see your code represented as a Model.

Fix with AI

If there are any errors in your Gems, perhaps introduced upstream without your knowledge, Data Copilot will automatically suggest one-click fixes.

The Fix with AI option appears either on the diagnostic screen that shows the error messages or directly alongside the expression itself.

Auto Documentation

Understanding data assets is much easier with Data Copilot’s auto-documentation. Data Copilot delivers summary documentation suggestions for all Datasets, Pipelines, Models, and Orchestrations.

Explain Gems

Data Copilot provides a high-level summary of a Pipeline and a more detailed description of each Gem.

Describe Datasets and Metadata

How did a Dataset change? Data Copilot recommends a description of the change for every edit you make. How was a column computed? Data Copilot suggests a plain English description that explains the data sources, how each column is generated, and what it represents.

This is a big time saver! You can edit the documentation suggestions and commit them to your repository.

Write Commit Messages and Release Notes

Data Copilot auto-documents wherever you need it, from granular data sources and columns to Gem labels, all the way to project descriptions. Data Copilot even helps you write commit messages and release notes.

Data Tests and Quality Checks

Unit tests and data quality checks are crucial for putting Pipelines and Jobs into production, yet many teams leave little time to develop these tests or, worse, don’t build them at all. With Data Copilot, you’ll have one or more suggested unit tests that can be integrated into your CI/CD process.
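
As a rough picture of what a unit test for a single transformation looks like, the sketch below tests a small row-level function with Python's standard `unittest` module. The transformation `add_order_total` and its fields are hypothetical; the tests Data Copilot suggests will follow your own Gems and test framework.

```python
import unittest


def add_order_total(row):
    """Hypothetical transformation: add total = quantity * unit_price to a row."""
    result = dict(row)  # Copy so the input row is not mutated.
    result["total"] = row["quantity"] * row["unit_price"]
    return result


class AddOrderTotalTest(unittest.TestCase):
    def test_total_is_quantity_times_price(self):
        row = {"order_id": 1, "quantity": 3, "unit_price": 2.5}
        self.assertEqual(add_order_total(row)["total"], 7.5)

    def test_input_row_is_not_mutated(self):
        row = {"order_id": 1, "quantity": 1, "unit_price": 4.0}
        add_order_total(row)
        self.assertNotIn("total", row)
```

Saved as a module, tests like these can be run with `python -m unittest`, which is also how a CI/CD job would typically invoke them.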

Data Copilot also suggests data quality checks based on the data profile and expectations.
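
A data quality check can be thought of as an expectation evaluated against the data. The sketch below, in plain Python, checks three common expectations: no nulls, uniqueness, and a valid range. The `order_id` and `amount` columns and the specific rules are assumptions for illustration, not checks produced by Data Copilot.

```python
def run_quality_checks(rows):
    """Return a list of failure messages; an empty list means every check passed."""
    failures = []
    ids = [row.get("order_id") for row in rows]

    # Expectation 1: order_id is never null.
    if any(value is None for value in ids):
        failures.append("order_id must not be null")

    # Expectation 2: non-null order_id values are unique.
    non_null_ids = [value for value in ids if value is not None]
    if len(set(non_null_ids)) != len(non_null_ids):
        failures.append("order_id must be unique")

    # Expectation 3: amount is present and non-negative.
    if any(row.get("amount") is None or row["amount"] < 0 for row in rows):
        failures.append("amount must be a non-negative number")

    return failures


sample = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": 0.0},
]
print(run_quality_checks(sample))
```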