SQL Statement

Creates one or more DataFrames by running the provided SQL queries against one or more input DataFrames.

Parameters

| Parameter | Meaning | Required |
| --- | --- | --- |
| DataFrame(s) | Input DataFrame(s) | True |
| SQL Queries | SQL query for each output tab | True |

Example

[Image: SQL example 1]

Info: The number of inputs and outputs can be changed as needed by clicking the + button on the respective tab.

Generated Code

def SQLStatement(spark: SparkSession, orders: DataFrame, customers: DataFrame) -> (DataFrame, DataFrame):
    orders.createOrReplaceTempView("orders")
    customers.createOrReplaceTempView("customers")
    df1 = spark.sql("select * from orders inner join customers on orders.customer_id = customers.customer_id")
    df2 = spark.sql("select distinct customer_id from orders")

    return df1, df2
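
To get a feel for what the two queries in the generated code return, it can help to run them on a few rows. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for Spark SQL so that it runs without a Spark session, and the sample rows plus the order_id and name columns are hypothetical. An order by clause is added to the first query purely to make the row order deterministic.

```python
import sqlite3

# Stand-in for the temporary views: two in-memory tables named like the inputs.
# The rows and the order_id / name columns are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, 99)],
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(10, "Ada"), (20, "Grace"), (30, "Linus")],
)

# First output: the inner join keeps only orders whose customer_id
# also exists in customers (order 4, customer 99, is dropped).
joined = conn.execute(
    "select * from orders inner join customers "
    "on orders.customer_id = customers.customer_id "
    "order by orders.order_id"
).fetchall()

# Second output: each customer_id that appears in orders, once.
distinct_ids = sorted(
    row[0] for row in conn.execute("select distinct customer_id from orders")
)

print(joined)
print(distinct_ids)
```

Note that select * over the join returns customer_id twice, once from each table. In Spark the resulting DataFrame likewise carries two columns with that name, so selecting explicit columns may be preferable when the output feeds further steps.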