RowDistributor

Spark Gem

Use the RowDistributor Gem to split an input DataFrame into multiple DataFrames based on the filter conditions you provide.

This is useful when rows from the input DataFrame need to be distributed into multiple DataFrames in different ways for downstream Gems.

Parameters

| Parameter | Description | Required |
|---|---|---|
| DataFrame | Input DataFrame whose rows need to be distributed into multiple DataFrames | True |
| Filter Conditions | Boolean-type column or boolean expression for each output tab. Supports SQL, Python, and Scala expressions. | True |

Example

Row distributor 1

Info: The number of outputs can be changed as needed by clicking the + button.

Generated Code

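from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import col, lit

def RowDistributor(spark: SparkSession, in0: DataFrame) -> (DataFrame, DataFrame, DataFrame):
    # Each output is an independent filter over the same input DataFrame.
    df1 = in0.filter((col("order_status") == lit("Started")))
    df2 = in0.filter((col("order_status") == lit("Approved")))
    df3 = in0.filter((col("order_status") == lit("Finished")))

    return df1, df2, df3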
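
The generated code applies one filter per output, so the same pattern can be sketched for any number of conditions. The example below is an illustrative sketch, not code produced by the Gem: `distribute_rows` and `example` are hypothetical names, and the sample `order_id` values are made up. It assumes `df` is a PySpark DataFrame and relies only on `DataFrame.filter`, which accepts either a boolean Column or a SQL expression string.

```python
from typing import Any, Sequence, Tuple


def distribute_rows(df: Any, conditions: Sequence[Any]) -> Tuple[Any, ...]:
    """Return one filtered DataFrame per condition, in the order given.

    Hypothetical helper: `df` is expected to be a pyspark.sql.DataFrame, and
    each condition a boolean Column or a SQL expression string.
    """
    return tuple(df.filter(condition) for condition in conditions)


def example(spark: Any) -> Tuple[Any, ...]:
    """Build a small orders DataFrame and split it by order_status.

    `spark` is an existing SparkSession; the sample rows are made up.
    """
    orders = spark.createDataFrame(
        [(1, "Started"), (2, "Approved"), (3, "Finished"), (4, "Cancelled")],
        ["order_id", "order_status"],
    )
    # SQL expression strings, one per output.
    # Order 4 matches no condition, so it appears in no output.
    return distribute_rows(
        orders,
        [
            "order_status = 'Started'",
            "order_status = 'Approved'",
            "order_status = 'Finished'",
        ],
    )
```

Because each output is an independent filter over the same input, a row that satisfies several conditions lands in several outputs, and a row that satisfies none is dropped from all of them.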