Python: getting org.bson.BsonInvalidOperationException: Invalid state INITIAL when printing a pyspark DataFrame
I can connect to MongoDB from a Spark job, but when I try to view the data loaded from the database, I get the error mentioned in the title. I am using Apache Spark's pyspark module. The code snippet is:
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext
import sys

# Show the console encodings
print(sys.stdin.encoding, sys.stdout.encoding)

# Read from and write to the github.users collection on the local MongoDB
conf = SparkConf()
conf.set('spark.mongodb.input.uri', 'mongodb://127.0.0.1/github.users')
conf.set('spark.mongodb.output.uri', 'mongodb://127.0.0.1/github.users')
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

# Load the collection as a DataFrame through the MongoDB Spark connector
df = sqlContext.read.format("com.mongodb.spark.sql.DefaultSource").load()
df.printSchema()

# Sort by follower count and fetch one row to view the data
df = df.sort('followers', ascending=True)
df.take(1)
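
For reference, both URI settings in the snippet point at the same database and collection. A minimal sketch of building them from their parts is below. `mongo_spark_settings` is a hypothetical helper name of my own, not part of pyspark or the MongoDB connector; only the two setting keys and the URI value come from the code above.

```python
def mongo_spark_settings(host, database, collection):
    """Return the two connector settings from the snippet as a plain dict.

    Hypothetical helper for illustration; it only formats strings and
    does not contact MongoDB or Spark.
    """
    uri = "mongodb://{}/{}.{}".format(host, database, collection)
    return {
        "spark.mongodb.input.uri": uri,
        "spark.mongodb.output.uri": uri,
    }


settings = mongo_spark_settings("127.0.0.1", "github", "users")
for key, value in settings.items():
    # Each pair could then be passed to conf.set(key, value)
    print(key, "=", value)
```

Printing the pairs before handing them to `conf.set` makes it easy to confirm that the job reads from the database and collection I expect.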