Apache Spark: getting `_corrupt_record` when reading JSON in PySpark
Tags: apache-spark, pyspark, apache-spark-sql

I am running this code in the PySpark shell:
from pyspark import SparkContext, SparkConf, SQLContext
sc = SparkContext()
sqlContext = SQLContext(sc)
source = [{"attr_1": 1, "attr_2": "[{\"a\":1,\"b\":1},{\"a\":2,\"b\":2}]"}, {"attr_1": 2, "attr_2": "[{\"a\":3,\"b\":3},{\"a\":4,\"b\":4}]"}]
df = sqlContext.read.json(sc.parallelize(source))
df.show()
But this is what I get:
+---------------+
|_corrupt_record|
+---------------+
| null|
| null|
+---------------+
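A likely cause (an assumption based on the code shown, not confirmed by the output alone): `read.json` expects an RDD of JSON *strings*. When it receives Python dicts instead, each element is converted with `str()`, which yields single-quoted repr text that is not valid JSON, so every row lands in `_corrupt_record`. A minimal demonstration without Spark:

```python
import json

row = {"attr_1": 1, "attr_2": "[{\"a\":1,\"b\":1}]"}

# str() on a dict produces Python repr syntax with single quotes ...
text = str(row)
print(text)  # {'attr_1': 1, 'attr_2': '[{"a":1,"b":1}]'}

# ... which is not parseable as JSON; Spark files such lines under _corrupt_record.
try:
    json.loads(text)
    parsed = True
except ValueError:
    parsed = False
print(parsed)  # False

# json.dumps() produces a string a JSON parser can round-trip.
print(json.loads(json.dumps(row)) == row)  # True
```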
I tried adding the multiLine option and setting it to True, but that did not work for me.

Does anyone have an idea how to handle this?