77 docs tagged with "gems"

Aggregate

Group data and apply aggregation methods or pivot operations

Avro

Parameters and properties to read from and write to Avro files

BigQuery

Parameters and properties to read from and write to the BigQuery warehouse

CosmosDB

Parameters and properties to read from and write to the CosmosDB warehouse

CSV

Parameters and properties to read from and write to CSV files

DataCleansing

Standardize data formats and address missing or null values in the data

Deduplicate

Remove rows with duplicate values of specified columns

Delta

Parameters and properties to read from and write to Delta files

Delta Table

Read from or write to tables managed by a Delta table metastore

Directory

Return a listing of all the files in a specified directory

DynamicReplace

Dynamically generate values depending on certain conditions

DynamicSelect

Dynamically filter columns of your dataset based on a set of conditions

Filter

Filter your data based on a custom filter condition

Fixed Format

Parameters and properties to read from and write to Fixed Format files

Functions

Build functions with SQL macros to be used in gem expressions

FuzzyMatch

Identify non-identical duplicates in your data

Gems

Power your pipelines with gems

Gems

Transform your data with Prophecy gems

Hive Table

Read from or write to tables managed by a Hive metastore

Iceberg

Read from or write to tables managed by Iceberg

JDBC

Parameters and properties to read from and write to the JDBC warehouse

Join

Join two or more datasets

Join

Join one or more DataFrames on conditions

JSON

Parameters and properties to read from and write to JSON files

Kafka

Parameters and properties to read from and write to Kafka topics

Limit

Limit the number of columns processed

Limit

Limit the number of rows

Macro

Use dbt macros in your pipelines

MongoDB

Parameters and properties to read from and write to the MongoDB warehouse

ORC

Parameters and properties to read from and write to ORC files

OrderBy

Sort your data based on one or more columns

Parquet

Parameters and properties to read from and write to Parquet files

Redshift

Parameters and properties to read from and write to the Redshift warehouse

Reformat

Use expressions to reformat column names and values

Reformat

Select one or more columns or values using expressions and functions

RestAPI

Call APIs from your pipeline

RestAPIEnrich

Enrich a DataFrame with content from a REST API response based on configuration

SampleRows

Sample records by choosing a specific number or percentage of records

Seed

Parameters and properties to read from Seed files

Snowflake

Parameters and properties to read from and write to the Snowflake warehouse

SQL Gems

Gems are data seeds, sources, transformations, and targets

Text

Parameters and properties to read from and write to Text files

Unpivot

Use the Unpivot gem to transform your data from a wide format to a long format

Upload files

Learn how to upload files to your Spark pipeline

Window

Create moving aggregations and transformations

XLSX (Excel)

Parameters and properties to read from and write to XLSX (Excel) files

XML

Parameters and properties to read from and write to XML files