Lookup
Lookup gems allow you to mark a particular DataFrame as a Broadcast DataFrame. Spark ensures that this data is available on every computation node, so you can perform lookups without shuffling data. This makes Lookups a good fit for referencing small tables of values from within other gems.
Under the hood, Prophecy implements Lookups as user-defined functions (UDFs). To learn about UDF support in Databricks, see our documentation on cluster access modes.
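For intuition, here is a minimal sketch in plain PySpark of how a broadcast-backed lookup UDF can work. This is not the code Prophecy generates; the table, column, and function names are invented for illustration.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

# A small lookup table: customer_id -> order_category.
lookup_df = spark.createDataFrame(
    [("0000", "electronics"), ("0001", "apparel")],
    ["customer_id", "order_category"],
)

# Collect the small table to the driver, then broadcast the resulting
# dict so every executor holds a read-only copy.
mapping = spark.sparkContext.broadcast(
    {r["customer_id"]: r["order_category"] for r in lookup_df.collect()}
)

@udf(returnType=StringType())
def my_lookup(key):
    # Runs on the executors against the broadcast copy, so no shuffle occurs.
    return mapping.value.get(key)

orders = spark.createDataFrame([("0000",), ("0001",), ("9999",)], ["c_id"])
orders.withColumn("order_category", my_lookup(col("c_id"))).show()
```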
Parameters
| Parameter | Description |
|---|---|
| Range Lookup | Whether to perform the lookup against a minimum and maximum value in a column (see the sketch after this table). |
| Key Columns | One or more columns to use as the lookup key in the source DataFrame. |
| Value Columns | Columns to reference wherever you use this Lookup gem. |
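To make the Range Lookup option concrete: a range lookup matches a value that falls between a minimum and a maximum column rather than matching an exact key. The sketch below shows the equivalent idea in plain PySpark; the table and column names (`min_qty`, `max_qty`, `tier`) are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Lookup table keyed by a [min_qty, max_qty] range instead of an exact key.
tiers = spark.createDataFrame(
    [(0, 9, "small"), (10, 99, "medium"), (100, 999, "bulk")],
    ["min_qty", "max_qty", "tier"],
)
orders = spark.createDataFrame([(5,), (42,), (250,)], ["qty"])

# A range lookup resolves each qty to the row whose range contains it.
orders.join(
    tiers,
    (orders.qty >= tiers.min_qty) & (orders.qty <= tiers.max_qty),
    "left",
).select("qty", "tier").show()
```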
Use a Lookup gem
After creating a Lookup gem, you can use the lookup in other gem expressions.
Column-based lookups
Assume you created a Lookup gem named MyLookup, with customer_id as the key column and order_category as a value column.
To perform a column-based lookup, use:
- Python: `lookup("MyLookup", col("customer_id")).getField("order_category")`
- Scala: `lookup("MyLookup", col("customer_id")).getField("order_category")`
- SQL: `MyLookup(customer_id)['order_category']`
Assume you also have the following Reformat gem:
Here, you have a column named `category` that is set to `MyLookup(c_id)['order_category']` in SQL expression mode. For each row, the Lookup gem compares the value in the `c_id` column to the `customer_id` key column and fills the new `category` column with the matching `order_category` value.
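Conceptually, this Reformat expression behaves like a broadcast left join from `c_id` to the Lookup's `customer_id` key. A rough plain-PySpark equivalent, with hypothetical DataFrame names, might look like:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast, col

spark = SparkSession.builder.getOrCreate()

lookup_df = spark.createDataFrame(
    [("0000", "electronics"), ("0001", "apparel")],
    ["customer_id", "order_category"],
)
source_df = spark.createDataFrame([("0000",), ("0001",)], ["c_id"])

# Broadcast the small lookup table and join it to the source on the key,
# surfacing order_category as the new category column.
result = source_df.join(
    broadcast(lookup_df),
    col("c_id") == col("customer_id"),
    "left",
).select("c_id", col("order_category").alias("category"))
result.show()
```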
Literal lookups
A Lookup key can be any column expression, including a literal. This means you can call Lookups with static keys:
- Python: `lookup("MyLookup", lit("0000")).getField("order_category")`
- Scala: `lookup("MyLookup", lit("0000")).getField("order_category")`
- SQL: `MyLookup('0000')['order_category']`
This expression evaluates to the value of `order_category` where `customer_id` is `0000`. This is useful when you want a table of predefined keys and their values available in expressions.
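One way to picture this pattern in plain PySpark: resolve the static key once against the lookup table and reuse the result as a constant in later expressions. The names below are hypothetical, not part of Prophecy's API.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, lit

spark = SparkSession.builder.getOrCreate()

# A table of predefined keys and values, e.g. application defaults.
defaults = spark.createDataFrame(
    [("0000", "uncategorized"), ("0001", "apparel")],
    ["customer_id", "order_category"],
)

# Resolve the static key "0000" once on the driver...
fallback = defaults.filter(col("customer_id") == "0000").first()["order_category"]

# ...then use it as a constant wherever an expression needs it.
orders = spark.createDataFrame([("a1",), ("a2",)], ["order_id"])
orders.withColumn("category", lit(fallback)).show()
```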