SetOperation
Spark Gem
Use the SetOperation Gem to combine rows from, or subtract rows between, DataFrames that share an identical schema but hold different data.
Parameters
Parameter | Description | Required |
---|---|---|
DataFrame 1 | First input DataFrame | True |
DataFrame 2 | Second input DataFrame | True |
DataFrame N | Nth input DataFrame | False |
Operation type | Operation to perform. - Union: Returns a DataFrame containing the rows from every input DataFrame, preserving duplicates. - Intersect All: Returns a DataFrame containing the rows present in all of the input DataFrames, preserving duplicates. - Except All: Returns a DataFrame containing the rows in the first DataFrame that are not in the other DataFrames, preserving duplicates. | True |
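The phrase "preserving duplicates" means each operation treats its inputs as bags (multisets) of rows rather than sets. As a rough illustration only, the sketch below models two inputs with Python's `collections.Counter` instead of Spark; the variable names and sample rows are made up for the example.

```python
from collections import Counter

# Two toy inputs with the same schema, modeled as bags of row tuples.
# This is a plain-Python model of the semantics, not Spark code.
in0 = Counter([("a", 1), ("a", 1), ("b", 2)])
in1 = Counter([("a", 1), ("c", 3)])

# Union: every row from both inputs, so per-row counts add up.
union_rows = sorted((in0 + in1).elements())

# Intersect All: a row appears as many times as it occurs in *both* inputs
# (the minimum of the two counts).
intersect_all_rows = sorted((in0 & in1).elements())

# Except All: each matching row in in1 cancels one occurrence in in0;
# rows whose count drops to zero or below disappear.
except_all_rows = sorted((in0 - in1).elements())

print(union_rows)
print(intersect_all_rows)
print(except_all_rows)
```

Here `("a", 1)` occurs twice in `in0` and once in `in1`, so it appears three times in the union, once in the intersection, and once in the difference.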
info
To add more input DataFrames, click the + icon on the left sidebar.
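One way to picture how an operation extends beyond two inputs is a left fold: apply the operation to the first two DataFrames, then apply it again between that result and each further input. The sketch below assumes that reading. `apply_set_operation` is a hypothetical helper written for illustration and is not necessarily the code the Gem generates.

```python
from functools import reduce
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported for type hints only; the helper works on any object that
    # exposes unionAll / intersectAll / exceptAll, as PySpark DataFrames do.
    from pyspark.sql import DataFrame


def apply_set_operation(method_name: str, in0: "DataFrame", in1: "DataFrame",
                        *more: "DataFrame") -> "DataFrame":
    """Hypothetical helper: fold one set operation over two or more inputs."""
    if method_name not in ("unionAll", "intersectAll", "exceptAll"):
        raise ValueError(f"unsupported operation: {method_name}")
    # Left fold: ((in0 op in1) op in2) op ..., so in0 stays on the left-hand side.
    return reduce(lambda acc, df: getattr(acc, method_name)(df), (in1, *more), in0)
```

With three PySpark DataFrames `a`, `b`, and `c`, the call `apply_set_operation("exceptAll", a, b, c)` would evaluate `a.exceptAll(b).exceptAll(c)`, which keeps the rows of `a` that remain after removing matches from both `b` and `c`.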
Examples
Operation Type - Union
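- Python
from pyspark.sql import DataFrame, SparkSession

# Append the rows of in1 to in0, keeping duplicates.
def union(spark: SparkSession, in0: DataFrame, in1: DataFrame) -> DataFrame:
    return in0.unionAll(in1)
- Scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Append the rows of in1 to in0, keeping duplicates.
object union {
  def apply(spark: SparkSession, in0: DataFrame, in1: DataFrame): DataFrame =
    in0.unionAll(in1)
}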
Operation Type - Intersect All
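- Python
from pyspark.sql import DataFrame, SparkSession

# Keep only the rows present in both in0 and in1, keeping duplicates.
def intersectAll(spark: SparkSession, in0: DataFrame, in1: DataFrame) -> DataFrame:
    return in0.intersectAll(in1)
- Scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Keep only the rows present in both in0 and in1, keeping duplicates.
object intersectAll {
  def apply(spark: SparkSession, in0: DataFrame, in1: DataFrame): DataFrame =
    in0.intersectAll(in1)
}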
Operation Type - Except All
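- Python
from pyspark.sql import DataFrame, SparkSession

# Keep the rows of in0 that are not in in1, keeping duplicates.
def exceptAll(spark: SparkSession, in0: DataFrame, in1: DataFrame) -> DataFrame:
    return in0.exceptAll(in1)
- Scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Keep the rows of in0 that are not in in1, keeping duplicates.
object exceptAll {
  def apply(spark: SparkSession, in0: DataFrame, in1: DataFrame): DataFrame =
    in0.exceptAll(in1)
}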