
How do I save a DataFrame to MongoDB with PySpark?


This is my code; I don't know where the error is, please help:

my_spark = SparkSession \
    .builder \
    .appName("DGLE") \
    config("spark.mongodb.input.uri", "mongodb://127.0.0.1/local.DGLE") \
    getOrCreate()

JDBC.write.format("com.mongodb.spark.sql.DefaultSource").option("spark.mongodb.output.uri", "mongodb://127.0.0.1:27017/local.DGLE").save()


The efficient way to write from PySpark to MongoDB is to use the MongoDB Spark Connector. The connector converts the data to BSON format and saves it to MongoDB. Say you have a Spark DataFrame named df that you want to save in MongoDB; you can try:

my_spark = SparkSession \
    .builder \
    .appName("DGLE") \
    .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/local.DGLE") \
    .getOrCreate()

JDBC.write.format("com.mongodb.spark.sql.DefaultSource") \
    .option("spark.mongodb.output.uri", "mongodb://127.0.0.1:27017/local.DGLE") \
    .save()
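Note the leading dots: in the snippet from the question, config(...) and getOrCreate() are missing the dot that chains them onto the builder, so the session is never built. (The DataFrame is still called JDBC here, as in the question; with a DataFrame named df you would call df.write the same way.)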
If you are using a notebook, add this at the top:

from pyspark.sql import SparkSession, SQLContext
from pyspark import SparkConf, SparkContext

# Create a SparkContext and wrap it in a SparkSession
sc = SparkContext()
spark = SparkSession(sc)


df.write.format("com.mongodb.spark.sql.DefaultSource") \
    .mode("append") \
    .option("spark.mongodb.output.uri", "mongodb://username:password@server_details:27017/db_name.collection_name?authSource=admin") \
    .save()
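If you want to confirm the write landed, here is a minimal read-back sketch, assuming the same placeholder credentials, server_details, and db_name.collection_name as in the write above, plus the spark session from the notebook snippet:

# Read the collection back to verify the write; the URI uses the same
# placeholder credentials and names as the write example above.
df_check = spark.read.format("com.mongodb.spark.sql.DefaultSource") \
    .option("spark.mongodb.input.uri", "mongodb://username:password@server_details:27017/db_name.collection_name?authSource=admin") \
    .load()
df_check.show()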
If you are using the spark-submit command, pass the connector with --packages org.mongodb.spark:mongo-spark-connector_2.11:2.3.2; in a Livy-backed notebook the equivalent is the %%configure magic:

%%configure
{"conf": {"spark.jars.packages": "org.mongodb.spark:mongo-spark-connector_2.11:2.3.2"}}
This is what the save function looks like:

from pyspark.sql import DataFrame

def save(message: DataFrame):
    # Append the incoming DataFrame to the configured MongoDB collection
    message.write \
        .format("mongo") \
        .mode("append") \
        .option("database", "db_name") \
        .option("collection", "collection_name") \
        .save()

Comments: Could you explain your question a bit more? Don't just post the code; please type in your code properly. I tried to edit it, but I really don't know what it is supposed to be. Please edit your question, otherwise I will have to flag it.
And this is what the SparkSession looks like:
spark: SparkSession = SparkSession \
    .builder \
    .appName("MyApp") \
    .config("spark.mongodb.input.uri", "mongodb://localhost:27017") \
    .config("spark.mongodb.output.uri", "mongodb://localhost:27017") \
    .config("spark.jars.packages", "org.mongodb.spark:mongo-spark-connector_2.12:3.0.1") \
    .master("local") \
    .getOrCreate()
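Putting the two together, a minimal usage sketch; the sample rows and column names below are invented for illustration:

# Hypothetical sample data; save() appends it to db_name.collection_name
# using the connector pulled in via spark.jars.packages above.
df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
save(df)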