Data Copilot AI capabilities
Prophecy Data Copilot provides suggestions from an AI model as you develop your data Pipelines and Models. You can view and incorporate suggestions directly within the Prophecy visual editor and code editor. Data Copilot makes suggestions for your entire Pipeline, for a single Gem (transformation), and even for individual expressions within each Gem.
Supported AI capabilities by engine
Data Copilot supports the following capabilities for the Spark and SQL engines:
AI Capability | Spark | SQL |
---|---|---|
Start a new Pipeline (Spark) or Model (SQL) | ||
Modify an existing Pipeline (Spark) or Model (SQL) | ||
Suggest Gems | ||
Suggest Expressions | ||
Generate with AI, Scripts | ||
Generate with AI, Functions | ||
Map with AI | ||
Code with AI | ||
Fix with AI, Gems | ||
Fix with AI, Expressions | ||
Auto Documentation | ||
Data Tests and Quality Checks |
Text to Pipelines
Get started quickly: type a prompt into the text box, and Data Copilot will generate a new Pipeline or modify an existing one.
Start a new Pipeline
You can use Data Copilot to start a new Pipeline by typing a simple English text prompt.
The following example uses Data Copilot to help start a Pipeline:
- Type a prompt with English text, such as "Which customers shipped the largest orders this year?"
- Data Copilot uses metadata from the accessible Datasets, Seeds, Pipelines, and Models to create a Knowledge Graph.
- Data Copilot creates the Pipeline based on the text prompt, using the Knowledge Graph as the context. This Pipeline is accessible in both the visual editor and the code editor.
- If you'd like, review the suggested changes before you decide to keep or reject the suggested Pipeline. Then interactively execute it to see the results.
- View Data Copilot's suggested changes in the visual editor.
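To make the example prompt concrete, the sketch below shows in plain Python the kind of transformation sequence (filter, aggregate, order) that such a prompt implies. It is not Data Copilot output: the `orders` records, the column names, and the `largest_orders_by_customer` helper are all hypothetical, and a generated Pipeline would express these steps as Gems over your own Datasets.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records; a real Pipeline would read these from a Dataset.
orders = [
    {"customer": "Acme", "amount": 1200.0, "ship_date": date(2024, 3, 1)},
    {"customer": "Birch", "amount": 450.0, "ship_date": date(2024, 5, 9)},
    {"customer": "Acme", "amount": 300.0, "ship_date": date(2023, 11, 20)},
    {"customer": "Cedar", "amount": 2100.0, "ship_date": date(2024, 1, 15)},
]


def largest_orders_by_customer(rows, year):
    """Filter to one year, keep each customer's largest order, sort descending."""
    largest = defaultdict(float)
    for row in rows:
        if row["ship_date"].year == year:  # Filter step
            # Aggregate step: keep the maximum order amount per customer
            largest[row["customer"]] = max(largest[row["customer"]], row["amount"])
    # OrderBy step: largest orders first
    return sorted(largest.items(), key=lambda item: item[1], reverse=True)


result = largest_orders_by_customer(orders, 2024)
print(result)
```

Reading a suggested Pipeline as a sequence like this (which rows are kept, how they are grouped, how they are ordered) is one way to review it before deciding to keep or reject it.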
Modify an existing Pipeline
You can also call Data Copilot to modify an existing Pipeline. Type a new text prompt, and Data Copilot will suggest a new sequence of data transformations. You don't have to select where the modification should go before Data Copilot makes its suggestion.
Added/updated Gems are highlighted in yellow.
Next-transformation suggestions
Data Copilot can suggest the next transformation in a series or the next expression within a Gem.
Suggest Gems
Data Copilot can suggest the next transformation for Leaf Nodes in a graph.
See the following Join suggestion example:
- Select and drop a Dataset of interest on the canvas.
- Data Copilot suggests Datasets which are frequently used with the selected Dataset.
- Data Copilot then suggests a next transformation, in this case, a Join Gem.
Suggest Expressions
As you continue development within a Gem, Data Copilot can suggest expressions for it.
Within our advanced Expression Builder you can:
- Type an English text prompt.
- Data Copilot generates a code expression for a particular column.
- Review the code expression, and if you'd like, try again with a different prompt.
- Run the Pipeline up to and including this Gem, and observe the resulting data sample.
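As an illustration of what "a code expression for a particular column" can look like, the sketch below pairs a hypothetical prompt with a hypothetical expression. It is written as a plain Python function, not in the Expression Builder's own syntax, and the prompt text, the `full_name` helper, and the column names are assumptions made for the example.

```python
# Hypothetical prompt: "Combine first and last name, trimmed and in title case"


def full_name(row):
    """Column expression: build a display name from two source columns."""
    first = (row.get("first_name") or "").strip()
    last = (row.get("last_name") or "").strip()
    # Skip empty parts so a missing last name does not leave a trailing space
    return " ".join(part for part in (first, last) if part).title()


# Small data sample, including stray whitespace and a null value
sample = [
    {"first_name": "  ada ", "last_name": "lovelace"},
    {"first_name": "grace", "last_name": None},
]
names = [full_name(row) for row in sample]
print(names)
```

Checking the expression against a small sample, as in the last step of the list, is a quick way to confirm that it handles blanks and nulls the way you intended.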
Generate with AI
Data Copilot can generate script Gems, user-defined functions in Spark, or macro functions in SQL.
Map with AI
You don't have to worry about mapping the schema across your Model. Data Copilot will map the target schema to the existing Gems and Datasets.
Code with AI
In addition to the visual editor above, you'll also see code completion suggestions in the code editor.
Data Copilot helps you build your Model in the code interface by making predictions as you type your code. And when you go back to the visual interface, you'll see your code represented as a Model.
Fix with AI
If there are any errors in your Gems, perhaps introduced upstream without your knowledge, Data Copilot will automatically suggest one-click fixes.
The Fix with AI option appears on the diagnostic screen where you see the error messages or directly with the expression itself.
Auto Documentation
Understanding data assets is much easier with Data Copilot’s auto-documentation. Data Copilot delivers summary documentation suggestions for all Datasets, Pipelines, Models, and Orchestrations.
Explain Gems
Here Data Copilot provides a high-level summary of a Pipeline and a more detailed description of each Gem.
Describe Datasets and Metadata
How did a Dataset change? Data Copilot recommends a description of the change for every edit you make. How was a column computed? Data Copilot suggests a plain English description that explains the data sources, how each column is generated, and what it represents.
This is a big time saver! You can edit the documentation suggestions and commit them to your repository.
Write Commit Messages and Release Notes
Data Copilot auto-documents anywhere you need it, from granular data sources and columns to Gem labels, all the way to project descriptions. Data Copilot even helps you write commit messages and release notes.
Data Tests and Quality Checks
Unit tests and data quality checks are crucial for Pipeline and Job productionalization, yet many teams leave little time to develop these tests or, worse, don’t build them at all. With Data Copilot, you’ll have one or more suggested unit tests that can be seamlessly integrated into your CI/CD process.
Data Copilot also suggests data quality checks based on the data profile and expectations.
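For readers unfamiliar with the distinction, the sketch below contrasts the two ideas in plain Python: a unit test fixes an input and asserts on a transformation's output, while a data quality check asserts expectations about the data itself. This is not the format Data Copilot generates; the `add_order_total` transformation, the `check_quality` helper, and the column names are hypothetical.

```python
def add_order_total(row):
    """Hypothetical transformation under test: total = quantity * unit_price."""
    return {**row, "total": row["quantity"] * row["unit_price"]}


def check_quality(rows, key, required):
    """Return readable failures for not-null and uniqueness expectations."""
    failures = []
    seen = set()
    for i, row in enumerate(rows):
        for column in required:
            if row.get(column) is None:  # expectation: required columns are not null
                failures.append(f"row {i}: {column} is null")
        if row.get(key) in seen:  # expectation: key values are unique
            failures.append(f"row {i}: duplicate {key}={row.get(key)}")
        seen.add(row.get(key))
    return failures


def test_add_order_total():
    """Unit test: known input, expected output."""
    row = add_order_total({"order_id": 1, "quantity": 3, "unit_price": 2.5})
    assert row["total"] == 7.5


test_add_order_total()

# Data quality check on a small sample with one null and one duplicate key
sample = [
    {"order_id": 1, "quantity": 3},
    {"order_id": 1, "quantity": None},
    {"order_id": 2, "quantity": 5},
]
failures = check_quality(sample, key="order_id", required=["quantity"])
print(failures)
```

Because both kinds of check are ordinary assertions over small inputs, they can run on every commit as part of a CI/CD process.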