Apache Spark: how do I save a pair RDD to a JSON file?
My RDD looks like this:
[('f1',1), ('f2',2)]
How can I save it to a JSON file?

You can convert the RDD to a DataFrame and write it out as JSON:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName('SO') \
    .getOrCreate()
sc = spark.sparkContext

# Each tuple becomes a row; the column names become the JSON keys
df = sc.parallelize(
    [('f1', 1), ('f2', 2)]).toDF(["key", "value"])
df.write.format('json').save('output_path')
The output in the JSON file looks like this:
{"key":"f1","value":1}
{"key":"f2","value":2}
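If you prefer to stay at the RDD level rather than convert to a DataFrame, you can serialize each pair yourself and then use `rdd.map(...).saveAsTextFile(...)`. A minimal sketch of the serialization step (plain Python, no Spark session; the `{"key": ..., "value": ...}` layout is an assumption matching the DataFrame output above):

```python
import json

# Each (key, value) pair becomes one JSON line, mirroring the
# DataFrame writer's output shown above.
pairs = [('f1', 1), ('f2', 2)]
lines = [json.dumps({"key": k, "value": v}) for k, v in pairs]
for line in lines:
    print(line)
```

In Spark you would apply the same lambda with `rdd.map(lambda kv: json.dumps({"key": kv[0], "value": kv[1]})).saveAsTextFile('output_path')`, which writes one JSON document per line, the same newline-delimited format the DataFrame writer produces.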
What JSON format are you expecting?