SetOperation

Spark Gem

Use the SetOperation Gem to add rows to, or subtract rows from, DataFrames that have identical schemas but different data.
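Because the Gem expects identical schemas, it can help to verify the inputs before combining them. The following is a minimal sketch, not part of the Gem itself: `assert_same_schema` is a hypothetical helper, and it only assumes that each input exposes a `.schema` attribute that supports equality comparison, as a PySpark DataFrame does. The check is stricter than Spark's own rules, since it also rejects differences in column names or nullability.

```python
from typing import Any, Sequence


def assert_same_schema(frames: Sequence[Any]) -> None:
    """Raise ValueError if any input's schema differs from the first input's.

    Hypothetical helper: each element is assumed to expose a `.schema`
    attribute that can be compared with `!=`, as a PySpark DataFrame does.
    """
    if len(frames) < 2:
        raise ValueError("a set operation needs at least two inputs")
    expected = frames[0].schema
    for position, frame in enumerate(frames[1:], start=1):
        if frame.schema != expected:
            raise ValueError(
                f"input {position} has schema {frame.schema!r}, "
                f"expected {expected!r}"
            )
```

Calling `assert_same_schema([in0, in1])` ahead of the set operation would surface a mismatch as a readable error instead of a failure inside Spark.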

Parameters

Parameter | Description | Required
DataFrame 1 | First input DataFrame | True
DataFrame 2 | Second input DataFrame | True
DataFrame N | Nth input DataFrame | False
Operation type | Operation to perform (one of the options listed below) | True

Operation type options:
- Union: Returns a DataFrame containing the rows from every input DataFrame, while preserving duplicates.
- Intersect All: Returns a DataFrame containing the rows present in all of the input DataFrames, while preserving duplicates.
- Except All: Returns a DataFrame containing the rows in the first DataFrame that are not in the other DataFrames, while preserving duplicates.
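To make "preserving duplicates" concrete, here is a small plain-Python model of the three operation types. It is an illustration of the semantics only, not Spark code and not the Gem's implementation: rows are modelled as hashable tuples, each result is a `collections.Counter` mapping a row to how many times it appears, and the function names are hypothetical. Applying Except All to more than two inputs by subtracting each later input in turn is an assumption of this sketch.

```python
from collections import Counter
from typing import Hashable, Iterable


def union_all(*inputs: Iterable[Hashable]) -> Counter:
    """Union: every row from every input; counts add up."""
    total: Counter = Counter()
    for rows in inputs:
        total.update(rows)
    return total


def intersect_all(first: Iterable[Hashable], *others: Iterable[Hashable]) -> Counter:
    """Intersect All: a row is kept as many times as its smallest count across inputs."""
    result = Counter(first)
    for rows in others:
        result &= Counter(rows)  # element-wise minimum of counts
    return result


def except_all(first: Iterable[Hashable], *others: Iterable[Hashable]) -> Counter:
    """Except All: each occurrence in a later input cancels one occurrence in the first."""
    result = Counter(first)
    for rows in others:
        result -= Counter(rows)  # subtract counts, dropping rows that reach zero
    return result


a = [(1,), (1,), (2,)]
b = [(1,), (3,)]
print(union_all(a, b))      # row (1,) appears three times
print(intersect_all(a, b))  # row (1,) appears once
print(except_all(a, b))     # one (1,) and one (2,) remain
```

The row `(1,)` occurs twice in `a` and once in `b`, so Union keeps three copies, Intersect All keeps one, and Except All leaves one.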
Info

To add more input DataFrames, click the + icon on the left sidebar.

Figure: Set Operation - Add input dataframe
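With more than two inputs, one way to picture the result is a left-to-right fold of the chosen operation over the inputs. The sketch below is written under that assumption; it is not the code the Gem generates, and `apply_set_operation` is a hypothetical name. It relies only on the PySpark DataFrame methods `unionAll`, `intersectAll`, and `exceptAll`, and imports PySpark for type hints only, so it loads even where Spark is not installed.

```python
from functools import reduce
from typing import TYPE_CHECKING, Sequence

if TYPE_CHECKING:  # needed for type hints only
    from pyspark.sql import DataFrame

SUPPORTED_OPERATIONS = {"unionAll", "intersectAll", "exceptAll"}


def apply_set_operation(frames: Sequence["DataFrame"], operation: str) -> "DataFrame":
    """Fold `operation` over the inputs from left to right.

    `operation` is the name of a DataFrame method: "unionAll",
    "intersectAll", or "exceptAll". Hypothetical helper for illustration.
    """
    if operation not in SUPPORTED_OPERATIONS:
        raise ValueError(f"unsupported operation: {operation}")
    if len(frames) < 2:
        raise ValueError("at least two input DataFrames are required")
    # ((in0 op in1) op in2) op ... for as many inputs as are supplied
    return reduce(lambda acc, df: getattr(acc, operation)(df), frames)
```

For example, `apply_set_operation([in0, in1, in2], "exceptAll")` evaluates `in0.exceptAll(in1).exceptAll(in2)`.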

Examples


Operation Type - Union

Example usage of Set Operation - Union

def union(spark: SparkSession, in0: DataFrame, in1: DataFrame) -> DataFrame:
    # Append the rows of in1 to in0; columns are matched by position and duplicates are kept.
    return in0.unionAll(in1)

Operation Type - Intersect All

Example usage of Set Operation - Intersect All

def intersectAll(spark: SparkSession, in0: DataFrame, in1: DataFrame) -> DataFrame:
    # Keep only the rows present in both in0 and in1, preserving duplicates.
    return in0.intersectAll(in1)

Operation Type - Except All

Example usage of Set Operation - Except All

def exceptAll(spark: SparkSession, in0: DataFrame, in1: DataFrame) -> DataFrame:
    # Keep the rows of in0 that are not in in1, preserving duplicates.
    return in0.exceptAll(in1)