
Source & Target

Source and Target Gems load data into a Pipeline and save data out of it.

File

A collection of Gems related to working with various file-based formats.

| Name | Description |
| --- | --- |
| CSV | Allows you to read or write a delimited file (often called a comma-separated values file, or CSV). |
| Parquet | Parquet is an open-source file format built for flat columnar data storage. |
| Avro | Avro is a row-based storage format for Hadoop that is widely used for data serialization. |
| Text | Allows you to read from or write to a text file. |
| Delta | Reads data from Delta files present at a path and writes Delta files to a path based on configuration. |
| JSON | Allows you to read or write JSON files. |
| ORC | ORC (Optimized Row Columnar) is a columnar file format designed for Spark/Hadoop workloads. |
| Fixed Format | Reads data from fixed-format files with an expected schema, or writes data to fixed-format files with an expected schema. |
| Kafka | This source currently connects to Kafka brokers in batch mode. |
| XLSX (Excel) | Allows you to read or write Excel-compatible files. |
| FTP | Allows you to read or write files (CSV, text, and binary) at a remote location. |
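
To make the difference between the CSV and Fixed Format rows concrete, here is a minimal sketch in plain Python. It is not the code these Gems generate; the sample data, the column names, and the `FIXED_SCHEMA` offsets are all assumptions made up for illustration. A delimited file carries its structure in the separator and header row, while a fixed-format file needs an externally supplied schema of column offsets.

```python
import csv
import io

# Hypothetical sample data; the column names and widths are assumptions.
DELIMITED = "id,name,city\n1,Ada,London\n2,Linus,Helsinki\n"
FIXED = "1  Ada     London  \n2  Linus   Helsinki\n"

# Assumed fixed-width layout: (column name, start offset, end offset).
FIXED_SCHEMA = [("id", 0, 3), ("name", 3, 11), ("city", 11, 19)]


def read_delimited(text, delimiter=","):
    """Parse delimited text whose first row is a header into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_fixed(text, schema):
    """Slice each line at the offsets given by the schema and trim padding."""
    rows = []
    for line in text.splitlines():
        rows.append({name: line[start:end].strip() for name, start, end in schema})
    return rows


def write_delimited(rows, delimiter=","):
    """Serialize a list of dicts back to delimited text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(rows[0]), delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Both readers yield the same records for this sample, so `write_delimited(read_fixed(FIXED, FIXED_SCHEMA))` converts the fixed-format text into its delimited equivalent. Every value is read as a string; applying types from the expected schema would be a separate step.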

Warehouse

A collection of Gems specializing in connecting to warehouse-style data sources.

| Name | Description |
| --- | --- |
| Snowflake | Allows you to read data from or write data to the Snowflake warehouse, using a high-performance connector. Enterprise only. |
| Redshift | Allows you to read data from or write data to the Redshift warehouse, using a high-performance connector. Enterprise only. |
| Teradata | Allows you to read data from or write data to the Teradata warehouse, using a high-performance connector. Enterprise only. |
| JDBC | Allows you to read data from or write data to a database over JDBC. |
| Oracle | Allows you to read data from or write data to the Oracle warehouse, using a high-performance connector. Enterprise only. |
| DB2 | Allows you to read data from or write data to the DB2 warehouse, using a high-performance connector. Enterprise only. |
| BigQuery | Allows you to read data from or write data to the BigQuery warehouse, using a high-performance connector. Enterprise only. |

Catalog

A collection of Gems related to working with various table-based formats.

| Name | Description |
| --- | --- |
| Hive | Reads from or writes to tables managed by a Hive metastore. |
| Delta | Reads data from Delta tables saved in a data catalog and writes data into a Delta table in a managed metastore. |

Lookup

Lookup is a special component that lets you broadcast any data so that it can be used later anywhere in your Pipeline.
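
The idea behind a lookup can be sketched in plain Python. This is only a conceptual illustration under stated assumptions, not the Lookup component's actual interface: the helper names `build_lookup` and `enrich` and the sample data are hypothetical. A small reference dataset is collapsed into a key-to-value mapping once, and later steps consult that mapping instead of joining against the reference data again.

```python
def build_lookup(rows, key_column, value_column):
    """Collapse a small reference dataset into a key -> value mapping."""
    return {row[key_column]: row[value_column] for row in rows}


def enrich(rows, lookup, key_column, target_column, default=None):
    """Add target_column to every row by looking up key_column in the mapping."""
    return [
        {**row, target_column: lookup.get(row[key_column], default)}
        for row in rows
    ]


# Hypothetical reference data and pipeline rows.
countries = [
    {"code": "FI", "name": "Finland"},
    {"code": "GB", "name": "United Kingdom"},
]
orders = [
    {"order_id": 1, "country": "GB"},
    {"order_id": 2, "country": "XX"},
]

# Build the mapping once, then reuse it wherever rows need enriching.
country_names = build_lookup(countries, "code", "name")
enriched = enrich(orders, country_names, "country", "country_name", default="unknown")
```

In a distributed engine the mapping would be shipped to every worker, which is why this pattern suits small reference datasets rather than large tables.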