Python: How do I convert a DataFrame column [JSON_Format] into multiple columns in PySpark?

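I receive JSON-formatted data from Kafka and read it into a DataFrame in PySpark.

After reading from Kafka, the data appears as a DataFrame:

DataFrame[value: string]

However, the value column contains a JSON/dict-formatted string.

I print the rows with: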

def print_row(row):
    print(row)
    pass

testing.writeStream.foreach(print_row).start()

How can I convert the value (JSON) into DataFrame columns, like this:

col_1  timestamp
80.0   2020-01-13T08:58:58.164Z

A DataFrame can be created for a JSON dataset represented by an RDD[String], where each string stores one JSON object:

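# Assumes an active SparkSession named `spark`.
sc = spark.sparkContext
jsonStrings = ['{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}']
otherPeopleRDD = sc.parallelize(jsonStrings)
otherPeople = spark.read.json(otherPeopleRDD)  # schema is inferred from the JSON
otherPeople.show()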

Because I read the data from Kafka following the link below, what I get back is a DataFrame, not JSON strings.

Define a schema and parse the JSON.

Copied:

from pyspark.sql.functions import col, from_json
from pyspark.sql.types import IntegerType, StringType, StructType

# value schema: { "a": 1, "b": "string" }
schema = StructType().add("a", IntegerType()).add("b", StringType())
# Kafka delivers key and value as binary, so cast to string before parsing
parsed = df.select(
    col("key").cast("string"),
    from_json(col("value").cast("string"), schema).alias("data"))