Source & Target
Source and Target Gems load data into your Pipeline and save data out of it.
File
A collection of Gems related to working with various file-based formats.
Name | Description |
---|---|
Avro | Avro is a row-based storage format for Hadoop that is widely used for data serialization. |
CSV | Allows you to read or write a delimited file (often called a comma-separated values file, or CSV). |
Delta | Reads data from Delta files present at a path and writes Delta files to a path based on configuration. |
Fixed Format | Reads data from fixed-format files with an expected schema, or writes data to fixed-format files with an expected schema. |
FTP | Allows you to read or write files (CSV, text, and binary) on a remote location. |
Iceberg | Reads data from Iceberg files present at a path and writes Iceberg files to a path based on configuration. |
JSON | Allows you to read or write JSON files. |
Kafka | This source currently connects with Kafka Brokers in Batch mode. |
ORC | ORC (Optimized Row Columnar) is a columnar file format designed for Spark/Hadoop workloads. |
Parquet | Parquet is an open-source columnar file format. |
Text | This Gem allows you to read from or write to a text file. |
XLSX (Excel) | Allows you to read or write Excel-compatible files. |
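
As a rough illustration of what a file Source and Target do conceptually, the sketch below parses a delimited file and writes the rows back out as newline-delimited JSON. It uses only the Python standard library and is not the code these Gems generate; the function names `read_delimited` and `write_json_lines` and the sample data are assumptions made for the example.

```python
import csv
import io
import json


def read_delimited(text, delimiter=","):
    """Source step: parse delimited text (first row is the header) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def write_json_lines(rows):
    """Target step: serialize each row as one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


# Hypothetical pipe-delimited input; every parsed value is a string.
sample = "id|name\n1|Ada\n2|Grace\n"
rows = read_delimited(sample, delimiter="|")
output = write_json_lines(rows)
print(output)
```

Note that the delimiter is a parameter: a "CSV" file in this sense is any delimited file, not only a comma-separated one.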
Warehouse
A collection of Gems specializing in connecting to warehouse-style data sources.
Name | Description |
---|---|
BigQuery | Allows you to read or write data to the BigQuery warehouse, using a high-performance connector. Enterprise only. |
CosmosDB | Allows you to read or write data to the CosmosDB database. |
DB2 | Allows you to read or write data to the DB2 warehouse, using a high-performance connector. Enterprise only. |
JDBC | Allows you to read or write data to a database over a JDBC connection. |
MongoDB | Allows you to read or write data to the MongoDB database. |
Oracle | Allows you to read or write data to the Oracle warehouse, using a high-performance connector. Enterprise only. |
Redshift | Allows you to read or write data to the Redshift warehouse, using a high-performance connector. Enterprise only. |
Salesforce | Allows you to read or write data to the Salesforce warehouse. |
Snowflake | Allows you to read or write data to the Snowflake warehouse, using a high-performance connector. Enterprise only. |
Teradata | Allows you to read or write data to the Teradata warehouse, using a high-performance connector. Enterprise only. |
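
The read-then-write pattern behind a warehouse Source and Target can be sketched with Python's built-in `sqlite3` module standing in for a real warehouse connection. This is only an illustration under that assumption: the table names `events` and `clicks` and the sample rows are hypothetical, and the Gems above rely on their own connectors rather than this code.

```python
import sqlite3

# An in-memory SQLite database stands in for a warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, kind TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "click"), (2, "view"), (3, "click")],
)

# Source step: read the rows of interest with a parameterized query.
clicks = conn.execute(
    "SELECT id, kind FROM events WHERE kind = ? ORDER BY id", ("click",)
).fetchall()

# Target step: write the selected rows to another table.
conn.execute("CREATE TABLE clicks (id INTEGER, kind TEXT)")
conn.executemany("INSERT INTO clicks VALUES (?, ?)", clicks)
conn.commit()

written = conn.execute("SELECT COUNT(*) FROM clicks").fetchone()[0]
conn.close()
```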
Catalog
A collection of Gems related to working with various table-based formats.
Name | Description |
---|---|
Delta | Reads data from Delta tables saved in a data catalog and writes data into Delta tables in a managed Metastore. |
Hive | Read from or write to Tables managed by a Hive metastore. |
Lookup
Lookup is a special component that allows you to broadcast any data so that it can be used later anywhere in your Pipeline.
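
The idea behind a lookup can be sketched in plain Python: a small reference dataset is turned into a key-to-value mapping once, and later steps consult that mapping instead of joining against the dataset again. This is a conceptual sketch only, not the Lookup component's implementation; `build_lookup`, `enrich`, and the sample data are hypothetical names introduced for the example.

```python
# Hypothetical data: a small reference table and a set of records to enrich.
country_rows = [
    {"code": "US", "name": "United States"},
    {"code": "IN", "name": "India"},
]
orders = [
    {"order_id": 1, "country": "US"},
    {"order_id": 2, "country": "IN"},
    {"order_id": 3, "country": "BR"},
]


def build_lookup(rows, key, value):
    """Build the mapping once from the reference rows."""
    return {row[key]: row[value] for row in rows}


def enrich(records, lookup, field, target, default=None):
    """Add a new column by looking up each record's field; missing keys get the default."""
    return [{**record, target: lookup.get(record[field], default)} for record in records]


country_lookup = build_lookup(country_rows, "code", "name")
enriched = enrich(orders, country_lookup, "country", "country_name")
```

Keys that are absent from the reference data, such as `"BR"` here, fall back to the default value rather than raising an error.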