Python: Getting org.bson.BsonInvalidOperationException: Invalid state INITIAL when printing a pyspark DataFrame

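I can connect to MongoDB from my Spark job, but when I try to view the data loaded from the database, I get the error mentioned in the title. I am using Apache Spark's pyspark module.

The code snippet is: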

import sys

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Show the encodings used by the console
print(sys.stdin.encoding, sys.stdout.encoding)

# Point the MongoDB Spark connector at the github.users collection
conf = SparkConf()
conf.set('spark.mongodb.input.uri', 'mongodb://127.0.0.1/github.users')
conf.set('spark.mongodb.output.uri', 'mongodb://127.0.0.1/github.users')

sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

# Load the collection as a DataFrame and print the schema
df = sqlContext.read.format("com.mongodb.spark.sql.DefaultSource").load()
df.printSchema()

# Sort by follower count, lowest first
df = df.sort('followers', ascending=True)

# Fetch the first row
df.take(1)