JSON

Read and write JSON-formatted files.

Source

Source Parameters

JSON Source supports all the available Spark read options for JSON.

The following table lists the additional parameters used to read a JSON file:

Parameter	Description	Required
Dataset Name	Name of the Dataset	True
Location	Location of the file(s) to be loaded, e.g. `dbfs:/data/test.json`	True
Schema	Schema to be applied to the loaded data. Can be defined or edited as JSON, or inferred using the `Infer Schema` button	True
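
Because the Source accepts the standard Spark read options for JSON, a read that combines an explicit schema with one such option might look like the sketch below. The function name `read_json_with_options`, the schema string, and the choice of the `multiLine` option are illustrative assumptions, not code generated by the Gem.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame, SparkSession


def read_json_with_options(spark: SparkSession, location: str) -> DataFrame:
    # Hypothetical schema given as a DDL string; replace it with your own columns.
    schema = "id INT, name STRING"
    return (
        spark.read.format("json")
        .schema(schema)
        # multiLine is a standard Spark JSON read option for records
        # that span several lines.
        .option("multiLine", True)
        .load(location)
    )
```

It could be called as `read_json_with_options(spark, "dbfs:/data/test.json")`.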

Example

Generated Code

Python:

def ReadJson(spark: SparkSession) -> DataFrame:
    return spark.read.format("json").load("dbfs:/FileStore/data/example.json")

Scala:

object ReadJson {

  def apply(spark: SparkSession): DataFrame =
    spark.read
      .format("json")
      .load("dbfs:/FileStore/data/example.json")

}

Target

Target Parameters

JSON Target supports all the available Spark write options for JSON.

The following table lists the additional parameters used to write a JSON file:

Parameter	Description	Required
Dataset Name	Name of the Dataset	True
Location	Location where the file(s) will be written, e.g. `dbfs:/data/output.json`	True
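
Since the Target accepts the standard Spark write options for JSON, a write that sets one of them might look like the sketch below. The function name `write_json_with_options` and the choice of the `compression` option with `gzip` are illustrative assumptions, not code generated by the Gem.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame


def write_json_with_options(in0: DataFrame, location: str) -> None:
    (
        in0.write.format("json")
        .mode("overwrite")
        # compression is a standard Spark JSON write option;
        # gzip is only an example value.
        .option("compression", "gzip")
        .save(location)
    )
```

It could be called as `write_json_with_options(in0, "dbfs:/data/output.json")`.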

Example

Generated Code

Python:

def write_json(spark: SparkSession, in0: DataFrame):
    in0.write\
        .format("json")\
        .mode("overwrite")\
        .save("dbfs:/data/test_output.json")

Scala:

object write_json {
  def apply(spark: SparkSession, in: DataFrame): Unit =
    in.write
      .format("json")
      .mode("overwrite")
      .save("dbfs:/data/test_output.json")
}

Producing a single output file

Because Spark is distributed, output is written as multiple separate partition files. If you need a single output file (for example, for reporting or for export to an external system), use a Repartition Gem in Coalesce mode with 1 output partition.

Caution: This is not recommended for extremely large datasets, as it may overwhelm the worker node writing the file.
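
In plain Spark terms, coalescing to one partition before the write is roughly equivalent to the sketch below. The function name `write_single_json` is a hypothetical helper, and the exact code Prophecy generates for the Repartition Gem may differ.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame


def write_single_json(in0: DataFrame, location: str) -> None:
    (
        # Collapse the data into a single partition so that only one
        # part file is written.
        in0.coalesce(1)
        .write.format("json")
        .mode("overwrite")
        .save(location)
    )
```

Spark still writes a directory at the target location; with one partition, that directory contains a single part file holding the data.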
