SQLStatement

Spark Gem

Creates one or more DataFrames by running the provided SQL queries against one or more input DataFrames.

Parameters

| Parameter | Meaning | Required |
| --- | --- | --- |
| DataFrame(s) | Input DataFrame(s) | True |
| SQL Queries | SQL query for each output tab | True |

Example

[Figure: SQL example 1]

Info: The number of inputs and outputs can be changed as needed by clicking the + button on the respective tab.

Generated Code

from pyspark.sql import DataFrame, SparkSession

def SQLStatement(spark: SparkSession, orders: DataFrame, customers: DataFrame) -> (DataFrame, DataFrame):
    # Register each input DataFrame as a temporary view so the queries can refer to it by name.
    orders.createOrReplaceTempView("orders")
    customers.createOrReplaceTempView("customers")
    # One query per output tab.
    df1 = spark.sql("select * from orders inner join customers on orders.customer_id = customers.customer_id")
    df2 = spark.sql("select distinct customer_id from orders")

    return df1, df2