Python generators for faster processing

Tags: python, mongodb, generator, pymongo

I have a collection, student, with at least 15 million documents. A sample of the document structure is shown below.

{
    "_id": "ObjectId('5e8baxxxxxe400a')",
    "Email": "xxx1@gmail.com",
    "Favourites": ["Red", "Blue", "Green", "Orange", "Black"],
    "marks": [{"physics": 23, "maths": 20, "chemistry": 19}]
},
{
    "_id": "ObjectId('5e8baxxxxxe4002')",
    "Email": "xxx2@gmail.com",
    "Favourites": ["Purple", "Pink", "Magenta", "White", "Black"],
    "marks": [{"physics": 22, "maths": 25, "chemistry": 19}]
},
{
    "_id": "ObjectId('5e8baxxxxxe4002')",
    "Email": "xxx3@gmail.com",
    "Favourites": ["Red", "Yellow", "Grey", "White", "Black"],
    "marks": [{"physics": 12, "maths": 14, "chemistry": 19}]
},
{
    "_id": "ObjectId('5e8baxxxxxe4002')",
    "Email": "xxx4@gmail.com",
    "Favourites": ["Green", "White", "Pink"],
    "marks": [{"physics": 25, "maths": 25, "chemistry": 19}]
},
{
    "_id": "ObjectId('5e8baxxxxxe4002')",
    "Email": "xxx5@gmail.com",
    "Favourites": ["Green", "White", "Black"],
    "marks": [{"physics": 10, "maths": 9, "chemistry": 19}]
},
.....
....
I am looking for a way to write a generator function that computes the total score for each student document.

My code looks like this:

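# `db` is assumed to be an already-connected pymongo Database object.
def query():
    # Fetch only the fields needed to compute the total.
    r = db.student.aggregate([
        {
            "$project": {
                "Email": 1,
                "marks": 1,
            }
        }
    ])
    for i in r:
        yield i



def find_total_score(doc):
    # "marks" is a one-element list holding the per-subject scores.
    marks = doc['marks'][0]
    total = marks['physics'] + marks['chemistry'] + marks['maths']
    doc['total'] = total   # write back the total score for sorting



# Pull documents from the generator one at a time until it is exhausted.
it = iter(query())
while True:
    try:
        item = next(it)
        find_total_score(item)
    except StopIteration:
        break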
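
The per-document total that find_total_score is meant to produce can be tried out without a database. The following is a minimal sketch, assuming every document has the shape of the sample above (a one-element marks list). SAMPLE_DOCS and with_totals are hypothetical names used only for illustration, and the in-memory list stands in for the pymongo cursor.

```python
# Stand-in for the cursor: documents shaped like the sample above.
SAMPLE_DOCS = [
    {"Email": "xxx1@gmail.com",
     "marks": [{"physics": 23, "maths": 20, "chemistry": 19}]},
    {"Email": "xxx2@gmail.com",
     "marks": [{"physics": 22, "maths": 25, "chemistry": 19}]},
    {"Email": "xxx3@gmail.com",
     "marks": [{"physics": 12, "maths": 14, "chemistry": 19}]},
]


def with_totals(docs):
    """Lazily yield each document after adding a 'total' field to it."""
    for doc in docs:
        marks = doc["marks"][0]  # assumption: exactly one marks entry
        doc["total"] = marks["physics"] + marks["maths"] + marks["chemistry"]
        yield doc


# Consuming the generator is what triggers the per-document work.
totals = [doc["total"] for doc in with_totals(SAMPLE_DOCS)]
print(totals)
```

Note that with_totals modifies each document in place, the same way doc['total'] = total does in the question.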





The whole process takes several minutes to complete. Using a generator function gives me no benefit. What am I doing wrong?
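
One way to look at the missing speed-up: a generator changes when each document is handled, not how many documents are handled. The sketch below uses hypothetical helpers (fake_docs and process are illustrative names, and generated dictionaries stand in for the collection) to count the per-document work with and without a generator. What a generator usually saves is peak memory, because the documents are never all held in a list at once.

```python
def fake_docs(n):
    """Hypothetical stand-in for the cursor: yields n small documents."""
    for i in range(n):
        yield {"marks": [{"physics": i, "maths": i, "chemistry": i}]}


def process(docs):
    """Compute the total for every document and count how many were handled."""
    handled = 0
    for doc in docs:
        doc["total"] = sum(doc["marks"][0].values())
        handled += 1
    return handled


# Lazy: documents are produced one at a time by the generator.
lazy_count = process(fake_docs(1000))

# Eager: all documents are materialised in a list first.
eager_count = process(list(fake_docs(1000)))

# Both paths do the same amount of per-document work.
print(lazy_count, eager_count)
```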

What kind of advantage do you expect (and compared to what)?

I need to compute the total scores and sort all of them in under 10 seconds.
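
Sorting by total needs every total before the first result can be emitted, so a generator is consumed completely at this step. Here is a minimal sketch on the five sample documents, assuming descending order is wanted (the question does not say). docs, total_of and ranked are hypothetical names for illustration.

```python
# The five sample documents, reduced to the fields used here.
docs = [
    {"Email": "xxx1@gmail.com",
     "marks": [{"physics": 23, "maths": 20, "chemistry": 19}]},
    {"Email": "xxx2@gmail.com",
     "marks": [{"physics": 22, "maths": 25, "chemistry": 19}]},
    {"Email": "xxx3@gmail.com",
     "marks": [{"physics": 12, "maths": 14, "chemistry": 19}]},
    {"Email": "xxx4@gmail.com",
     "marks": [{"physics": 25, "maths": 25, "chemistry": 19}]},
    {"Email": "xxx5@gmail.com",
     "marks": [{"physics": 10, "maths": 9, "chemistry": 19}]},
]


def total_of(doc):
    """Sum the three subject scores from the single marks entry."""
    marks = doc["marks"][0]
    return marks["physics"] + marks["maths"] + marks["chemistry"]


# sorted() accepts any iterable (a generator too) but reads all of it
# before returning; reverse=True is an assumption about the wanted order.
ranked = sorted(docs, key=total_of, reverse=True)

for doc in ranked:
    print(doc["Email"], total_of(doc))
```

This only illustrates the ordering step on five documents; it says nothing about timing at 15 million documents.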