Avoiding duplicates with $addToSet in a MongoDB aggregation pipeline
I have an aggregation pipeline:
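db.getCollection('yourCollection').aggregate([
    // Emit one document per element of "dates" and record its position in "idx"
    {
        $unwind: {
            path: "$dates",
            includeArrayIndex: "idx"
        }
    },
    // Pick the element at the same position from each parallel array
    {
        $project: {
            _id: 0,
            dates: 1,
            numbers: { $arrayElemAt: ["$numbers", "$idx"] },
            goals: { $arrayElemAt: ["$goals", "$idx"] },
            durations: { $arrayElemAt: ["$durations", "$idx"] }
        }
    }
])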
It is run against the following data (sample document):
The query works, but it returns duplicate records, so I tried to use the $addToSet operator to avoid the duplicates:
db.getCollection('yourCollection').aggregate([
    {
        $match: {
            "number": number
        }
    },
    {
        $unwind: {
            path: "$dates",
            includeArrayIndex: "idx"
        }
    },
    {
        $group: {
            _id: '$_id',
            dates: { $addToSet: '$dates' }
        }
    },
    {
        $project: {
            _id: 0,
            dates: 1,
            numbers: { $arrayElemAt: ["$numbers", "$idx"] },
            goals: { $arrayElemAt: ["$goals", "$idx"] },
            durations: { $arrayElemAt: ["$durations", "$idx"] }
        }
    }
])
But I only get the dates (the other fields are null).
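Does anyone know what the problem is?
You need to use the $first operator in the $group stage to carry the other fields through the pipeline, as follows: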
db.getCollection('yourCollection').aggregate([
    { "$unwind": "$dates" },
    {
        "$group": {
            "_id": "$_id",
            // Collect the distinct dates; $addToSet does not guarantee their order
            "dates": { "$addToSet": "$dates" },
            // Carry the other arrays through $group unchanged
            "numbers": { "$first": "$numbers" },
            "goals": { "$first": "$goals" },
            "durations": { "$first": "$durations" }
        }
    },
    {
        "$unwind": {
            "path": "$dates",
            "includeArrayIndex": "idx"
        }
    },
    // idx is now the position within the deduplicated dates array
    {
        "$project": {
            "_id": 0,
            "dates": 1,
            "numbers": { "$arrayElemAt": ["$numbers", "$idx"] },
            "goals": { "$arrayElemAt": ["$goals", "$idx"] },
            "durations": { "$arrayElemAt": ["$durations", "$idx"] }
        }
    }
])
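To make the effect of the extra accumulators concrete, here is a minimal sketch in plain JavaScript that imitates how a $group stage builds its output. The group helper and the sample values are hypothetical and written only for illustration; they are not part of MongoDB, and the real $addToSet does not guarantee element order. It suggests why a field without an accumulator is simply absent after grouping, so a later $arrayElemAt on it evaluates to null.

```javascript
// Hypothetical helper, NOT a MongoDB API: a tiny in-memory imitation of $group
// that supports only the $first and $addToSet accumulators.
function group(docs, idField, accumulators) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[idField];
    if (!groups.has(key)) {
      // The output document starts with _id only; nothing else is copied over.
      groups.set(key, { _id: key });
    }
    const out = groups.get(key);
    for (const [name, [op, field]] of Object.entries(accumulators)) {
      if (op === "$first") {
        // Keep the value from the first document seen in the group.
        if (!(name in out)) out[name] = doc[field];
      } else if (op === "$addToSet") {
        // Append the value only if it is not already in the array.
        if (!(name in out)) out[name] = [];
        if (!out[name].includes(doc[field])) out[name].push(doc[field]);
      } else {
        throw new Error("unsupported accumulator: " + op);
      }
    }
  }
  return [...groups.values()];
}

// Documents as they might look after { $unwind: "$dates" } (made-up values).
const unwound = [
  { _id: 1, dates: "d1", numbers: ["n1", "n2"] },
  { _id: 1, dates: "d2", numbers: ["n1", "n2"] },
  { _id: 1, dates: "d1", numbers: ["n1", "n2"] },
];

// Only an accumulator for dates: numbers does not survive the grouping.
const datesOnly = group(unwound, "_id", { dates: ["$addToSet", "dates"] });

// With a $first accumulator, numbers is carried through.
const withNumbers = group(unwound, "_id", {
  dates: ["$addToSet", "dates"],
  numbers: ["$first", "numbers"],
});

console.log(JSON.stringify(datesOnly));
console.log(JSON.stringify(withNumbers));
```

Run with Node.js, the first line printed has no numbers field, while the second one keeps it.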
Or use $setUnion to eliminate the duplicates, as follows:
db.getCollection('yourCollection').aggregate([
    {
        "$project": {
            "_id": 0,
            // $setUnion of an array with itself yields its distinct elements
            // (the order of the result is unspecified)
            "dates": { "$setUnion": ["$dates", "$dates"] },
            "numbers": 1,
            "goals": 1,
            "durations": 1
        }
    },
    {
        "$unwind": {
            "path": "$dates",
            "includeArrayIndex": "idx"
        }
    },
    {
        "$project": {
            "_id": 0,
            "dates": 1,
            "dateIndex": "$idx",
            "numbers": { "$arrayElemAt": ["$numbers", "$idx"] },
            "goals": { "$arrayElemAt": ["$goals", "$idx"] },
            "durations": { "$arrayElemAt": ["$durations", "$idx"] }
        }
    }
])
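When $group runs, it essentially drops all other fields. After that point you cannot project them back. If you only want to remove duplicates from the array, it is best done in JavaScript/client code or with map-reduce. See here: You can also modify the $group pipeline stage to add the other fields in it (see chridam's answer).
Thanks, I tried both solutions, but there are still duplicates :/
Could you update your question with sample documents that produce the duplicates, along with the expected output for those documents?
Please take a look at the question; has the problem been resolved?
Can you update your question with sample documents that produce the duplicates and your expected output, so that I can test and confirm?
Yes, but it is similar. Also, I use $match before the $unwind operator. Could that be a problem?
Please look at the second sample document. I get
{dates: '1399027545000', number: '0982', goals: '2', durations: 92}, {dates: '1399101432000', number: '0982', goals: '2', durations: 92}, {dates: '1399026850000', number: '0982', goals: '2', durations: 92}, {dates: '1399027545000', number: '0982', goals: '2', durations: 92}
As you can see, the last document is a duplicate of the first.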
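Following the suggestion above to remove duplicates in client code, here is a minimal sketch in plain JavaScript. It assumes a document shaped the way the pipelines expect, with parallel dates, numbers, goals and durations arrays; the uniqueRows helper and the sample document (rebuilt from the rows quoted above) are hypothetical. Values are paired by index first and each row is compared as a whole, so the fields stay aligned after deduplication.

```javascript
// Hypothetical helper (not part of MongoDB or any driver): turn one document
// holding parallel arrays into unique rows, compared on all four values.
function uniqueRows(doc) {
  const seen = new Set();
  const rows = [];
  doc.dates.forEach((date, i) => {
    // Pair the i-th element of every array before removing anything.
    const row = {
      dates: date,
      numbers: doc.numbers[i],
      goals: doc.goals[i],
      durations: doc.durations[i],
    };
    // Serialize the values in a fixed order to get a comparable key.
    const key = JSON.stringify([row.dates, row.numbers, row.goals, row.durations]);
    if (!seen.has(key)) {
      seen.add(key);
      rows.push(row);
    }
  });
  return rows;
}

// Assumed document shape; the values mirror the four rows quoted above.
const doc = {
  dates: ["1399027545000", "1399101432000", "1399026850000", "1399027545000"],
  numbers: ["0982", "0982", "0982", "0982"],
  goals: ["2", "2", "2", "2"],
  durations: [92, 92, 92, 92],
};

const rows = uniqueRows(doc);
console.log(rows.length); // 3: the fourth row repeats the first and is dropped
console.log(JSON.stringify(rows));
```

The same function could be applied to each document returned by a plain find(), which avoids relying on the element order of $addToSet or $setUnion.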