Knowledge graph indexer

A knowledge graph is an internal index that maps your data environment. Prophecy uses knowledge graphs to help AI agents understand your SQL warehouse structure. The knowledge graph contains metadata about tables, schemas, columns, and data types—not your actual data.

When you interact with AI agents, Prophecy uses the knowledge graph to add context to your prompts. This context helps AI agents generate accurate SQL code that references the correct tables and columns in your warehouse.
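
For intuition, a single table's entry in the knowledge graph might look like the sketch below. The field names are illustrative assumptions, not Prophecy's actual internal format:

```python
# Hypothetical shape of one knowledge graph entry (field names invented).
# Only structural metadata is stored, never row-level data.
orders_entry = {
    "catalog": "sales",
    "schema": "analytics",
    "table": "orders",
    "columns": [
        {"name": "order_id", "type": "BIGINT"},
        {"name": "customer_id", "type": "BIGINT"},
        {"name": "order_date", "type": "DATE"},
        {"name": "total_amount", "type": "DECIMAL(10,2)"},
    ],
}
```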

Prophecy automatically indexes your data environment when you create a fabric. Afterward, you need to either schedule the indexer to run automatically or trigger it manually.

info

Prophecy only indexes tables from your SQL warehouse connection. Datasets from data ingress/egress connections are not included in the knowledge graph.

How indexing works

When triggered, the knowledge graph indexer:

  1. Connects to your SQL warehouse using configured credentials.
  2. Scans the catalogs and schemas that your warehouse connection's identity has access to.
  3. Indexes table names, schemas, column names, data types, and other metadata.
  4. Updates the knowledge graph with this information.
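
Conceptually, the scan reads the same kind of metadata you could query yourself through the Unity Catalog information_schema views. The sketch below, built on the open source databricks-sql-connector, illustrates the idea rather than Prophecy's actual implementation; the hostname, HTTP path, token, and my_catalog are placeholders:

```python
# Illustrative metadata scan (not Prophecy's code): list every column's
# schema, table, name, and data type in one catalog via information_schema.
from databricks import sql

with sql.connect(
    server_hostname="dbc-example.cloud.databricks.com",  # placeholder host
    http_path="/sql/1.0/warehouses/abc123",              # placeholder warehouse
    access_token="dapi-placeholder",                     # placeholder credential
) as conn:
    with conn.cursor() as cursor:
        # Only metadata is read; table rows are never fetched.
        cursor.execute(
            """
            SELECT table_schema, table_name, column_name, data_type
            FROM my_catalog.information_schema.columns
            ORDER BY table_schema, table_name, ordinal_position
            """
        )
        for schema, table, column, dtype in cursor.fetchall():
            print(f"{schema}.{table}.{column}: {dtype}")
```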

Configure automatic indexing

Configure scheduled crawling to keep your index up-to-date without manual intervention.

  1. In Prophecy, navigate to Metadata > Fabrics.
  2. Select the fabric where you will enable indexing.
  3. Open the Connections tab.
  4. Click the pencil icon to edit the SQL Warehouse Connection.
  5. In the connection dialog, scroll to the Knowledge Graph Indexer tile and toggle on Enable Knowledge Graph Periodic Indexing.
  6. Configure the schedule to run hourly, daily, or weekly.

The schedule must have a defined frequency and timezone. The default timezone is the timezone from which you access Prophecy.

Hourly

| Parameter | Description | Default |
| --- | --- | --- |
| Repeat every ... from | The interval in hours between indexer runs, starting at a specific time. Example: Repeat every 2 hours from 12:00 AM. | Every 1 hour starting at 2:00 AM |

Daily

| Parameter | Description | Default |
| --- | --- | --- |
| Repeat at | The time of day when the indexer runs. Example: Repeat at 9:00 AM. | 2:00 AM |

Weekly

| Parameter | Description | Default |
| --- | --- | --- |
| Repeat on | The day(s) of the week that the indexer runs. Example: Repeat on Monday, Wednesday, Friday. | Sunday |
| Repeat at | The time of day that the indexer runs. Example: Repeat at 9:00 AM. | 2:00 AM |
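
To make the frequency and timezone behavior concrete, here is a small illustrative helper (not Prophecy code) that resolves a daily schedule to its next run, using the 2:00 AM default from the tables above:

```python
# Sketch: resolve "repeat daily at 02:00 in a given timezone" to the next
# concrete run time. Prophecy computes this internally; this helper is ours.
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

def next_daily_run(now: datetime, run_at: time, tz: str) -> datetime:
    """Return the next occurrence of run_at in timezone tz."""
    local_now = now.astimezone(ZoneInfo(tz))
    candidate = local_now.replace(
        hour=run_at.hour, minute=run_at.minute, second=0, microsecond=0
    )
    if candidate <= local_now:
        candidate += timedelta(days=1)  # today's slot has already passed
    return candidate

# Example: next 2:00 AM run for a user accessing Prophecy from New York.
print(next_daily_run(datetime.now(ZoneInfo("UTC")), time(2, 0), "America/New_York"))
```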

Manually trigger indexing

You may need to manually trigger indexing if you know that certain tables are missing from the knowledge graph. To do so:

  1. In Prophecy, navigate to Metadata > Fabrics.
  2. Select the fabric that you want to reindex.
  3. Open the Connections tab.
  4. Click the pencil icon to edit the SQL Warehouse Connection.
  5. Scroll to the Knowledge Graph Indexing Status tile in the connection dialog.
  6. Click Start to reindex the tables. You can track the progress of processed schemas and directories.

If more convenient, you can also start this process from the Environment tab in your project:

  1. Open a project in the project editor.
  2. Attach to the fabric that you wish to reindex.
  3. In the left sidebar, open the Environment tab.
  4. Below your connections, you’ll see a Missing Tables? callout.
  5. Click Refresh to reindex the SQL warehouse.

tip

You might be prompted to manually trigger indexing if the agent can’t locate a table during a conversation.

Add separate authentication for the indexer

Prophecy lets you configure authentication credentials for the knowledge graph indexer separately from pipeline execution credentials, allowing you to control which tables get indexed and how results are scoped for different users.

This means there are two types of credentials stored in a connection:

  • Pipeline Development and Scheduled Execution credentials control how pipelines authenticate when they run.
  • Knowledge Graph Indexer credentials control how the crawler authenticates when it indexes your warehouse on an automated schedule.

If you don't add separate authentication for the indexer, it will use the pipeline development credentials when running.
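
In effect, credential resolution follows a simple precedence, shown here as a hypothetical sketch with invented attribute names:

```python
# Hypothetical precedence (attribute names invented for illustration):
# dedicated indexer credentials win; otherwise fall back to the pipeline
# development credentials stored on the same connection.
def resolve_indexer_credentials(connection):
    return connection.indexer_credentials or connection.pipeline_dev_credentials
```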

note

If the pipeline development authentication strategy is Personal Access Token (PAT) rather than OAuth, the knowledge graph indexer always uses the pipeline development identity, so this section does not apply.

Prerequisites

Before configuring dedicated credentials for the knowledge graph indexer, you must:

  • Upgrade to Prophecy 4.2.2 or later.
  • Use a Databricks connection as your SQL warehouse connection. Other SQL warehouse types are not supported.
  • Be a Prophecy administrator. There are no role-based restrictions on configuring the knowledge graph indexer, but you need to understand how authentication works in Prophecy.
  • Be a Databricks administrator. This lets you assign appropriate permissions to the identity that will run the indexer. The identity must have MANAGE access on the assets that you wish to index in the knowledge graph.

caution

The knowledge graph indexing permissions should be equal to or a superset of the pipeline execution permissions. This ensures that the same tables you use in your pipelines are indexed by the knowledge graph. However, Prophecy does not enforce this.
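
If you want to spot-check this, one approach is to compare the tables each identity can see through information_schema, which Unity Catalog filters by the caller's privileges. The connection details and helper below are assumptions for illustration:

```python
# Sketch: flag tables the pipeline identity can see but the indexer cannot.
from databricks import sql

def visible_tables(host: str, http_path: str, token: str) -> set[str]:
    """Return fully qualified table names visible to this credential."""
    with sql.connect(server_hostname=host, http_path=http_path,
                     access_token=token) as conn:
        with conn.cursor() as cursor:
            cursor.execute(
                """
                SELECT table_catalog || '.' || table_schema || '.' || table_name
                FROM my_catalog.information_schema.tables
                """
            )
            return {row[0] for row in cursor.fetchall()}

pipeline = visible_tables("dbc-example.cloud.databricks.com",
                          "/sql/1.0/warehouses/abc123", "pipeline-token")
indexer = visible_tables("dbc-example.cloud.databricks.com",
                         "/sql/1.0/warehouses/abc123", "indexer-token")
missing = pipeline - indexer
if missing:
    print("Visible to pipelines but not indexed:", sorted(missing))
```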

Procedure

To configure the knowledge graph indexer for a fabric:

  1. In Prophecy, navigate to Metadata > Fabrics.

  2. Select the fabric where you will enable indexing.

  3. Open the Connections tab.

  4. Click the pencil icon to edit the SQL Warehouse Connection.

  5. In the dialog, scroll to the Knowledge Graph Indexer tile.

  6. Configure authentication based on your pipeline development authentication method:

    If you use User OAuth for pipeline development:

    • Choose either OAuth (User) or OAuth (Service Principal) for the knowledge graph indexer.

    If you use Service Principal OAuth for pipeline development:

    • You can only use Service Principal OAuth for the knowledge graph indexer.

Service Principal OAuth

Service Principal OAuth is recommended for production and scheduled indexing. Credentials don't expire.

  • Configuration: Reuse pipeline development credentials or provide a different Service Principal Client ID and Client Secret.
  • What gets indexed: All tables that the service principal can access.
info

If you use User OAuth for pipeline development, Prophecy enforces user permissions even when the indexer uses service principal credentials. Users only see tables they have permission to access.
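
For reference, Service Principal OAuth corresponds to Databricks' documented machine-to-machine client-credentials flow. A minimal sketch of the token exchange, with a placeholder workspace and credentials:

```python
# Sketch of the OAuth client-credentials exchange behind Service Principal
# OAuth. The /oidc/v1/token endpoint and "all-apis" scope follow Databricks'
# documented M2M flow; the workspace URL and IDs are placeholders.
import requests

resp = requests.post(
    "https://dbc-example.cloud.databricks.com/oidc/v1/token",
    auth=("<service-principal-client-id>", "<client-secret>"),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
    timeout=30,
)
resp.raise_for_status()
token = resp.json()["access_token"]  # short-lived; re-requested as needed
```

Because this exchange requires no interactive login, scheduled indexing keeps working even when no user is signed in.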

User OAuth

User OAuth should only be used for development.

  • Configuration: Uses the same app registration as pipeline development.
  • What gets indexed: All tables that the individual user can access.
  • Limitations: Requires frequent user logins. Scheduled crawling can fail when user credentials expire.