Python: unwinding parallel arrays and formatting the response as key-value pairs
I am new to MongoDB and IPython. My dataset looks like this:
book1 = {
    "author": "A A",
    "book": {
        "series": "19 A, 19 B, 19 C",
        "year": "1990, 1991, 1992"
    }
}
book2 = {
    "author": "B B",
    "book": {
        "series": "20 A, 20 B, 19 C",
        "year": "1995, 1995, 1992"
    }
}
book3 = {
    "author": "C C",
    "book": {
        "series": "19 A, 19 B, 19 C",
        "year": "1990, 1991, 1992"
    }
}
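These documents are inserted into MongoDB.
I would like to split series and year so that the first entry of series is paired with the first entry of year, the second with the second, and so on (the term "column" may not be right for this data, because series and year are text, not arrays):
I want it to print documents as shown above. The series is unique.
What I have done so far is the code below. The idea is to split the text (series and year) and then unwind both. But I don't know how to build the list shown above, and this code returns an error that I don't know how to fix.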
project = {"$project": {"series_list" : {"$split" : ["book.series", ", "]},
{"year_list" : {"$split" : ["book.year", ", "]} }}
}
unwind = {"$unwind" : "$series_list", "$year_list" }
group = {"$group" : {"_id": {"series": "$series_list"}}, "year":"$year_list"}
cur = db.collection.aggregate([project, unwind, group])
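You can try the aggregation below on MongoDB 3.4. The idea is to $split the series and year strings, $zip the two resulting arrays together and $map them into an array of documents with series and year key-value pairs, followed by $unwind and $group to create the unique combinations. $replaceRoot then promotes _id to the top level.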
db.collection_name.aggregate([
  {
    "$project": {
      "series_and_year_list": {
        "$map": {
          "input": {
            "$zip": {
              "inputs": [
                { "$split": ["$book.series", ", "] },
                { "$split": ["$book.year", ", "] }
              ]
            }
          },
          "as": "zipped",
          "in": {
            "series": { "$arrayElemAt": ["$$zipped", 0] },
            "year": { "$arrayElemAt": ["$$zipped", 1] }
          }
        }
      }
    }
  },
  {
    "$unwind": "$series_and_year_list"
  },
  {
    "$group": {
      "_id": {
        "series": "$series_and_year_list.series",
        "year": "$series_and_year_list.year"
      }
    }
  },
  {
    "$replaceRoot": {
      "newRoot": "$_id"
    }
  }
])
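Since the question is tagged Python, the same pipeline can also be written as plain dictionaries and passed to PyMongo's aggregate. This is only a sketch and has not been run against the asker's database: the helper name unique_series_year is hypothetical, and collection is assumed to be a PyMongo collection holding the documents from the question.

```python
# The aggregation pipeline as Python dicts; it mirrors the shell version above.
pipeline = [
    {"$project": {
        "series_and_year_list": {
            "$map": {
                # Pair the n-th series with the n-th year.
                "input": {"$zip": {"inputs": [
                    {"$split": ["$book.series", ", "]},
                    {"$split": ["$book.year", ", "]},
                ]}},
                "as": "zipped",
                "in": {
                    "series": {"$arrayElemAt": ["$$zipped", 0]},
                    "year": {"$arrayElemAt": ["$$zipped", 1]},
                },
            }
        }
    }},
    # One output document per (series, year) pair.
    {"$unwind": "$series_and_year_list"},
    # Grouping on the pair removes duplicates across books.
    {"$group": {"_id": {
        "series": "$series_and_year_list.series",
        "year": "$series_and_year_list.year",
    }}},
    # Promote the grouped pair to the top level.
    {"$replaceRoot": {"newRoot": "$_id"}},
]


def unique_series_year(collection):
    """Hypothetical helper: run the pipeline and return the results as a list."""
    return list(collection.aggregate(pipeline))
```

It would be called as unique_series_year(db.collection_name). Each result is a document with a series and a year key, and $group does not guarantee any particular order.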
You can try the following approach:
book1 = {
    "author": "A A",
    "book": {
        "series": "19 A, 19 B, 19 C",
        "year": "1990, 1991, 1992"
    }
}
book2 = {
    "author": "B B",
    "book": {
        "series": "20 A, 20 B, 19 C",
        "year": "1995, 1995, 1992"
    }
}
book3 = {
    "author": "C C",
    "book": {
        "series": "19 A, 19 B, 19 C",
        "year": "1990, 1991, 1992"
    }
}
book_list = [book1, book2, book3]
for book in book_list:
    # Split each comma-separated string into a list of trimmed items.
    series = [s.strip() for s in book["book"]["series"].split(",")]
    years = [y.strip() for y in book["book"]["year"].split(",")]
    # Pair the n-th series with the n-th year and print one document per pair.
    for s, y in zip(series, years):
        print({"_id": {"series": s}, "year": y})
Output:
{'_id': {'series': '19 A'}, 'year': '1990'}
{'_id': {'series': '19 B'}, 'year': '1991'}
{'_id': {'series': '19 C'}, 'year': '1992'}
{'_id': {'series': '20 A'}, 'year': '1995'}
{'_id': {'series': '20 B'}, 'year': '1995'}
{'_id': {'series': '19 C'}, 'year': '1992'}
{'_id': {'series': '19 A'}, 'year': '1990'}
{'_id': {'series': '19 B'}, 'year': '1991'}
{'_id': {'series': '19 C'}, 'year': '1992'}
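The output above still repeats pairs that occur in more than one book. If only the unique combinations are wanted, as in the aggregation answer, one option is to remember the pairs already seen. A small sketch in plain Python, assuming the same book.series / book.year layout; the function name unique_pairs is made up for illustration:

```python
def unique_pairs(books):
    """Return the distinct (series, year) pairs as documents, in first-seen order."""
    seen = set()
    result = []
    for book in books:
        series = [s.strip() for s in book["book"]["series"].split(",")]
        years = [y.strip() for y in book["book"]["year"].split(",")]
        for pair in zip(series, years):
            if pair not in seen:
                seen.add(pair)
                result.append({"_id": {"series": pair[0]}, "year": pair[1]})
    return result


books = [
    {"author": "A A", "book": {"series": "19 A, 19 B, 19 C", "year": "1990, 1991, 1992"}},
    {"author": "B B", "book": {"series": "20 A, 20 B, 19 C", "year": "1995, 1995, 1992"}},
    {"author": "C C", "book": {"series": "19 A, 19 B, 19 C", "year": "1990, 1991, 1992"}},
]
for doc in unique_pairs(books):
    print(doc)
```

For the three sample books this prints five documents instead of nine.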
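Hi, Ayodhyankit Paul. Thank you for trying to help me, I appreciate it. But the data is already inserted in MongoDB, and I want to aggregate it so that it prints like the list you produced.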