Hive Table
Reads from and writes to Hive tables that your execution environment's Metadata catalog manages.
Prerequisites
Before you specify parameters and properties, select the Hive table type:
- Open the Source or Target gem configuration.
- On the Type & Format page, select Catalog Table.
- On the Properties page, set the provider property to `hive`.
Parameters
Parameter | Tab | Description |
---|---|---|
Use Unity Catalog | Location | Whether to use a Unity catalog. |
Catalog | Location | If you use a Unity catalog, specify which catalog to use. |
Database | Location | Name of the database to connect to. |
Table | Location | Name of the table to connect to. |
Use file path | Location | Whether to use a custom file path to store underlying files in the Target gem. |
Schema | Properties | Schema to apply on the loaded data. In the Source gem, you can define or edit the schema visually or in JSON code (see the example below). In the Target gem, you can view the schema visually or as JSON code. |
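For illustration only, the following sketch shows how a schema could be built programmatically and rendered as JSON. The column names `id` and `name` are hypothetical, and the exact JSON shown in the Schema property's code view may differ from Spark's `StructType` JSON used here.

```python
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# Hypothetical two-column schema; your real schema comes from the Hive table.
schema = StructType([
    StructField("id", IntegerType(), nullable=True),
    StructField("name", StringType(), nullable=True),
])

# Print a JSON representation of the schema (Spark's StructType JSON format).
print(schema.json())
```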
Source
The Source gem reads data from Hive tables and allows you to optionally specify the following additional properties.
Source properties
Properties | Description | Default |
---|---|---|
Description | Description of your dataset. | None |
Provider | Provider to use. You must set this to `hive`. | `delta` |
Filter Predicate | `WHERE` clause to filter the table by. | (all records) |
Source example
Compiled code
tip
To see the compiled code of your project, switch to the Code view in the project header.
Without filter predicate
- Python

```python
def Source(spark: SparkSession) -> DataFrame:
    return spark.read.table(f"test_db.test_table")
```

- Scala

```scala
object Source {
  def apply(spark: SparkSession): DataFrame = {
    spark.read.table("test_db.test_table")
  }
}
```
With filter predicate
- Python

```python
def Source(spark: SparkSession) -> DataFrame:
    return spark.sql("SELECT * FROM test_db.test_table WHERE col > 10")
```

- Scala

```scala
object Source {
  def apply(spark: SparkSession): DataFrame =
    spark.sql("SELECT * FROM test_db.test_table WHERE col > 10")
}
```
Target
The Target gem writes data to Hive tables and allows you to optionally specify the following additional properties.
Target properties
Property | Description | Default |
---|---|---|
Description | Description of your dataset. | None |
Provider | Provider to use. You must set this to `hive`. | `delta` |
Write Mode | How to handle existing data. For a list of the possible values, see Supported write modes. | error |
File Format | File format to use when saving data. Supported file formats are: `sequencefile`, `rcfile`, `orc`, `parquet`, `textfile`, and `avro`. | `parquet` |
Partition Columns | List of columns to partition the Hive table by. | None |
Use insert into | Whether to use the `insertInto()` method to write instead of the `saveAsTable()` method (see the sketch after this table). | false |
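As an illustration (not actual compiled output), enabling Use insert into would roughly change the Target write to use `insertInto()` against an existing table. A minimal Python sketch, assuming the table `test_db.test_table` already exists:

```python
from pyspark.sql import SparkSession, DataFrame

def Target(spark: SparkSession, in0: DataFrame):
    # Sketch only: insertInto() writes into an existing table, matching columns
    # by position and reusing the table's existing file format and partitioning.
    in0.write\
        .mode("append")\
        .insertInto("test_db.test_table")
```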
Supported write modes
Write mode | Description |
---|---|
overwrite | If the data already exists, overwrite the data with the contents of the `DataFrame`. |
error | If the data already exists, throw an exception. |
append | If the data already exists, append the contents of the `DataFrame`. |
ignore | If the data already exists, do nothing with the contents of the `DataFrame`. This is similar to the `CREATE TABLE IF NOT EXISTS` clause in SQL. |
Target example
Compiled code
tip
To see the compiled code of your project, switch to the Code view in the project header.
- Python

```python
def Target(spark: SparkSession, in0: DataFrame):
    in0.write\
        .format("hive")\
        .option("fileFormat", "parquet")\
        .mode("overwrite")\
        .saveAsTable("test_db.test_table")
```

- Scala

```scala
object Target {
  def apply(spark: SparkSession, in: DataFrame): Unit = {
    in.write
      .format("hive")
      .option("fileFormat", "parquet")
      .mode("overwrite")
      .saveAsTable("test_db.test_table")
  }
}
```
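If you set Partition Columns, the write would additionally partition the table. A minimal Python sketch, assuming a hypothetical partition column named `dt`:

```python
from pyspark.sql import SparkSession, DataFrame

def Target(spark: SparkSession, in0: DataFrame):
    # Sketch only: "dt" is a hypothetical partition column; the actual compiled
    # code depends on your gem configuration.
    in0.write\
        .format("hive")\
        .option("fileFormat", "parquet")\
        .partitionBy("dt")\
        .mode("append")\
        .saveAsTable("test_db.test_table")
```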