
User-defined functions

Lets you create user-defined functions (UDFs) that can then be used anywhere in the pipeline.


| Field | Description | Required |
| --- | --- | --- |
| UDF Name | Name under which the UDF is registered. All calls to the UDF use this name. | True |
| Definition | Definition of the UDF function, e.g. udf((value:Int)=>value*value) | |
| UDF initialization code | Code block that initializes entities used by UDFs. For example, it could contain a static mapping that a UDF uses. | False |


Defining and Using UDF

Step 1 - Open UDF definition window

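from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

# Static mapping used by the UDF.
country_code_map = {"Mexico": "MX", "USA": "US", "India": "IN"}

def registerUDFs(spark: SparkSession):
    # Register the UDF under the name used to call it in the pipeline.
    spark.udf.register("get_country_code", get_country_code)

@udf(returnType=StringType())
def get_country_code(country: str):
    # Fall back to "Not Found" for countries missing from the mapping.
    return country_code_map.get(country, "Not Found")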
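Once a UDF is registered under a name, it can be called by that name in a SQL expression. The following is a minimal sketch of that usage, not the product's generated code. The helper names `lookup_country_code` and `register_and_query` are hypothetical, and the sample rows are made up for illustration. The lookup is kept as a plain Python function so it can be checked without a Spark session, and `pyspark` is imported only inside the function that needs it.

```python
# Static mapping, mirroring the example above.
country_code_map = {"Mexico": "MX", "USA": "US", "India": "IN"}


def lookup_country_code(country):
    """Return the country code, or "Not Found" for unknown or null input."""
    # Hypothetical helper: plain Python, so it can be tested without Spark.
    return country_code_map.get(country, "Not Found")


def register_and_query(spark):
    """Register the lookup as a SQL UDF and apply it to a small DataFrame.

    `spark` is assumed to be an existing SparkSession. This function is a
    hypothetical illustration and is not called here.
    """
    from pyspark.sql.types import StringType

    # Register the plain function under the name used in SQL expressions.
    spark.udf.register("get_country_code", lookup_country_code, StringType())

    # Made-up sample rows; "Brazil" is absent from the mapping on purpose.
    df = spark.createDataFrame(
        [("Mexico",), ("India",), ("Brazil",)], ["country"]
    )

    # The registered UDF is referenced by name inside the SQL expression.
    return df.selectExpr(
        "country", "get_country_code(country) AS country_code"
    )
```

With a live session, `register_and_query(spark).show()` would display each country next to its looked-up code. Separating the lookup from the registration is a design choice for this sketch: the mapping logic can be unit-tested on its own, while the Spark-specific part stays small.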