Reformat

Edits one or more column names or values using expressions and functions. It is useful when you need to keep only the required columns or apply column-wise changes.

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| DataFrame | Input DataFrame on which changes are required | True |
| Target column | Output column name | False |
| Expression | Expression to compute the target column | Required if a Target column is present |
Info: If no columns are selected, all columns are passed through to the output.

Example

Example usage of Reformat

Spark Code

Reformat compiles to a SQL SELECT (in relational terms, a projection), unlike the SchemaTransform Gem, which uses the underlying withColumn construct.

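from pyspark.sql import DataFrame, SparkSession
from pyspark.sql.functions import col, concat_ws

def Reformat(spark: SparkSession, in0: DataFrame) -> DataFrame:
    # Project only the listed columns: rename email and build address_string
    return in0.select(
        col("id"),
        col("email").alias("email_address"),
        col("name"),
        col("updated_at"),
        # Join the address parts with "$$$" as the separator
        concat_ws("$$$", col("address_line1"), col("address_line2"), col("postal_code"))
            .alias("address_string")
    )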
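
The difference between a projection and a withColumn-style transform, mentioned above, can be illustrated without Spark. The following is a minimal sketch in plain Python over dictionary rows. The helper names `project` and `with_column` and the sample row are hypothetical and exist only for this illustration; they are not part of Spark or of the Gem's generated code.

```python
# Toy model of the two semantics on plain dict rows; this is NOT Spark code.
from typing import Callable, Dict, List

Row = Dict[str, object]
Expr = Callable[[Row], object]


def project(rows: List[Row], exprs: Dict[str, Expr]) -> List[Row]:
    """Projection (select-like): output holds only the listed target columns."""
    return [{target: expr(row) for target, expr in exprs.items()} for row in rows]


def with_column(rows: List[Row], target: str, expr: Expr) -> List[Row]:
    """withColumn-like: keep every input column, then add or overwrite one."""
    return [{**row, target: expr(row)} for row in rows]


# Hypothetical sample data for the illustration.
rows = [{"id": 1, "email": "a@example.com", "name": "Ann", "phone": "555-0100"}]

# Projection: only "id" and the renamed "email_address" survive.
projected = project(rows, {
    "id": lambda r: r["id"],
    "email_address": lambda r: r["email"],
})

# withColumn-like: all four input columns survive, plus "email_address".
extended = with_column(rows, "email_address", lambda r: r["email"])

print(projected)
print(sorted(extended[0].keys()))
```

In this model the projected row contains only the two target columns, while the withColumn-like result keeps `email`, `name`, and `phone` alongside the new column. This is why a projection suits the case of extracting only the required columns.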