Python 3.x: Is there a better way to load Oracle BLOB data into a MongoDB collection using GridFS?

python-3.x mongodb oracle

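I recently started a new project that requires transferring data from an Oracle table into a MongoDB collection.

The Oracle table contains a column of the BLOB data type. I want to use GridFS to transfer the table's BLOB data to MongoDB. I got it working, but I cannot make it scale.

If I run the same script against 10k or 50k records, it takes a very long time.

Please advise whether there is anything I can improve, or whether there is a better way to achieve my goal.

Thanks in advance.

Below is the sample code I use to load a small amount of data.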

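from pymongo import MongoClient
import cx_Oracle
from gridfs import GridFSBucket
import pickle

# pymongo requires the mongodb:// scheme in a connection URI
client = MongoClient('mongodb://localhost:27017/sample')
dbm = client.sample

db = <--oracle connection----->
cursor = db.cursor()

def get_notes_file_sys():
    # open_upload_stream() is a GridFSBucket method; the legacy GridFS class does not have it
    return GridFSBucket(dbm, bucket_name='notes')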

def save_data_in_file(fs, note, file_name):
    # Serialize the blob, then remove it from the dict so that only the
    # remaining columns are stored as GridFS metadata.
    data_blob = pickle.dumps(note['file_content_blob'])
    del note['file_content_blob']

    gridin = fs.open_upload_stream(file_name, chunk_size_bytes=261120, metadata=note)
    gridin.write(data_blob)
    gridin.close()
    # Return the id of the stored GridFS file
    return {'note_id': gridin._id}

# ---------------------------Uploading files start---------------------------------------
fs = get_notes_file_sys()

query = ("""SELECT id, file_name, file_content_blob, author, created_at FROM notes fetch next 10 rows only""")
cursor.execute(query)
rows = cursor.fetchall()
# Oracle reports unquoted column names in upper case; lower-case them to match the keys used below
col = [co[0].lower() for co in cursor.description]
final_arr = []
for row in rows:
    data = dict(zip(col, row))
    file_name = data['file_name']
    if data["file_content_blob"] is not None:
        # This line below takes most of the time
        data["file_content_blob"] = data["file_content_blob"].read()
    note_id = save_data_in_file(fs, data, file_name)
    data['note_id'] = note_id
    final_arr.append(data)
# pymongo collections provide insert_many(); there is no bulk_insert()
dbm['notes'].insert_many(final_arr)

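Two things come to mind:

Don't move to Mongo. Just use Oracle's SODA document storage model: you could also take a look at Oracle's JSON DB service:

Fetch the BLOB as bytes, which is much faster than the approach you are using. Here is an example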
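
What follows is a minimal sketch of the "fetch as bytes" idea, not the answerer's original example. It assumes cx_Oracle 8 or later, where the constants `DB_TYPE_BLOB` and `DB_TYPE_LONG_RAW` and the connection attribute `outputtypehandler` are available. The helper name `make_blob_as_bytes_handler` and its `driver` parameter are made up for illustration. With such a handler installed, BLOB columns should come back as `bytes` together with the row, so the per-row `.read()` call on a LOB locator is no longer needed. As far as I know, this route is limited to values of at most 1 GB.

```python
def make_blob_as_bytes_handler(driver):
    """Build an output type handler that returns BLOB columns as bytes.

    `driver` is the imported database module (cx_Oracle in the question).
    It is passed in as an argument, which is an assumption of this sketch,
    so that the handler does not hard-code the import.
    """
    def handler(cursor, name, default_type, size, precision, scale):
        if default_type == driver.DB_TYPE_BLOB:
            # Ask the driver to fetch the column inline as LONG RAW instead
            # of returning a LOB locator that needs a separate .read().
            return cursor.var(driver.DB_TYPE_LONG_RAW,
                              arraysize=cursor.arraysize)
        # Returning None keeps the driver's default handling for other types.
        return None

    return handler


# Hypothetical usage with the question's connection object `db`:
#
#   import cx_Oracle
#   db.outputtypehandler = make_blob_as_bytes_handler(cx_Oracle)
#   cursor = db.cursor()
#   cursor.execute(query)
#   for row in cursor:
#       ...  # the BLOB column in `row` is already a bytes object
```

If this works as described, the `.read()` branch in the question's loop can be dropped, and the bytes can be written to GridFS directly.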