MongoDB aggregation: $addToSet then $sort
I am trying to sort the unique values of arrays held in several fields of a MongoDB collection (using the Node.js driver). A small data set:
[{
"_id" : "5c93db3dd0184516406013f7",
"filters" : {
"genres" : [
{
"_id" : "9CXBYc4qP8sqcNMZ5",
"fr" : "Art Abstrait",
"en" : "Abstract Art",
"de" : "Abstrakte Kunst",
"it" : "Arte astratta",
"es" : "Arte Abstracto"
}
],
"subjects" : [
{
"_id" : "3QjL6YSfmuY6NFHGG",
"fr" : "Abstrait",
"en" : "Abstract",
"de" : "Abstrakt",
"it" : "Astratto",
"es" : "Abstracto"
}
],
"type" : {
"_id" : "CYK2WcepkJsy5xXMo",
"fr" : "Gravure au carborundum",
"en" : "Carborundum etching",
"de" : "Carborundum Radierung",
"it" : "Incisione carborandum",
"es" : "Grabado al Carborundum"
}
}
},
{
"_id" : "5c93db3ed0184516406013f8",
"filters" : {
"genres" : [
{
"_id" : "9CXBYc4qP8sqcNMZ5",
"fr" : "Art Abstrait",
"en" : "Abstract Art",
"de" : "Abstrakte Kunst",
"it" : "Arte astratta",
"es" : "Arte Abstracto"
}
],
"subjects" : [
{
"_id" : "3QjL6YSfmuY6NFHGG",
"fr" : "Abstrait",
"en" : "Abstract",
"de" : "Abstrakt",
"it" : "Astratto",
"es" : "Abstracto"
}
],
"type" : {
"_id" : "CYK2WcepkJsy5xXMo",
"fr" : "Gravure au carborundum",
"en" : "Carborundum etching",
"de" : "Carborundum Radierung",
"it" : "Incisione carborandum",
"es" : "Grabado al Carborundum"
}
}
},
{
"_id" : "5c93e19ed018451640601da6",
"filters" : {
"genres" : [
{
"_id" : "9CXBYc4qP8sqcNMZ5",
"fr" : "Art Abstrait",
"en" : "Abstract Art",
"de" : "Abstrakte Kunst",
"it" : "Arte astratta",
"es" : "Arte Abstracto"
}
],
"subjects" : [
{
"_id" : "3QjL6YSfmuY6NFHGG",
"fr" : "Abstrait",
"en" : "Abstract",
"de" : "Abstrakt",
"it" : "Astratto",
"es" : "Abstracto"
}
],
"type" : {
"_id" : "KfGWEHL2pAto8nfze",
"fr" : "Gravure",
"en" : "Etching",
"de" : "Radierung",
"it" : "Incisione",
"es" : "Grabado"
}
}
}]
My query (run with lang = 'en') uses the following aggregation pipeline:
[
{ $unwind: '$filters.subjects' },
{ $unwind: '$filters.genres' },
{ $group: {
_id: null,
subjects: { $addToSet: '$filters.subjects' },
types: { $addToSet: '$filters.type' },
genres: { $addToSet: '$filters.genres' },
}},
{ $unwind: '$subjects' },
{ $unwind: '$genres' },
{ $unwind: '$types' },
{ $sort: {
[`subjects.${lang}`]: 1,
[`types.${lang}`]: 1,
[`genres.${lang}`]: 1,
}},
{ $group: {
_id: null,
subjects: { $push: '$subjects' },
types: { $push: '$types' },
genres: { $push: '$genres' },
}},
{ $project: {
_id: false,
subjects: '$subjects',
types: '$types',
genres: '$genres'
}}
]
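Instead of getting sorted arrays of unique values like this:
[A,B,C,D,…]
I get sorted arrays with non-unique values like this:
[A,A,A,B,B,C,C,D,D,…]
which makes the $addToSet grouping useless.
Any idea what I am getting wrong?

The problem you are running into is that every $unwind creates one copy of the document for each element of the array being unwound. You have the following: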
...
{ $unwind: '$subjects' },
{ $unwind: '$genres' },
{ $unwind: '$types' },
...
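So first you unwind subjects, which produces one document for each element of subjects; call each of these a subject document. Every subject document still contains the full genres and types arrays. When genres is unwound next, each subject document is expanded again into one document per element of genres. That produces genres.length copies of each subject; in other words, every subject is duplicated once for every genre in the array. The same thing happens again when types is unwound.
In short, you are duplicating data with every $unwind call.
To illustrate with a simple example: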
// Doc:
{
ints: [1, 2],
alpha: ['a', 'b', 'c']
}
// Pipeline:
[
{ $unwind: "$ints" },
{ $unwind: "$alpha" }
]
// After unwinding "ints":
[
{ ints: 1, alpha: ['a', 'b', 'c'] },
{ ints: 2, alpha: ['a', 'b', 'c'] }
]
// After unwinding "alpha":
[
{ ints: 1, alpha: 'a' },
{ ints: 1, alpha: 'b' },
{ ints: 1, alpha: 'c' },
{ ints: 2, alpha: 'a' },
{ ints: 2, alpha: 'b' },
{ ints: 2, alpha: 'c' }
]
// Result: 3 duplicates of each value in "ints", 2 duplicates of each value in "alpha".
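The duplication shown above can be reproduced without a database. The following is a minimal sketch in plain JavaScript; the `unwind` and `countValues` helpers are hypothetical stand-ins written for this illustration (they only handle top-level array fields) and are not part of the MongoDB driver.

```javascript
// Simplified stand-in for $unwind on a top-level array field.
// Assumption: every document has an array in `field`.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((element) => ({ ...doc, [field]: element }))
  );
}

// Count how many output documents carry each value of `field`.
function countValues(rows, field) {
  const counts = {};
  for (const row of rows) {
    counts[row[field]] = (counts[row[field]] || 0) + 1;
  }
  return counts;
}

const docs = [{ ints: [1, 2], alpha: ['a', 'b', 'c'] }];
const unwound = unwind(unwind(docs, 'ints'), 'alpha');

console.log(unwound.length);                // 6
console.log(countValues(unwound, 'ints'));  // { '1': 3, '2': 3 }
console.log(countValues(unwound, 'alpha')); // { a: 2, b: 2, c: 2 }
```

Pushing `ints` and `alpha` back into arrays from these six documents would therefore yield three copies of each integer and two copies of each letter, which is the pattern seen in the question.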
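To fix this, a couple of options come to mind right away:
1. You can $unwind one array, $sort on it, and then $group to $push the results back into an array, repeating this for each array, one at a time. Note that when grouping you need to use the $first operator to keep just one copy of each of the other arrays, since they are repeated on every unwound document.
2. You can change the last $group pipeline stage to use $addToSet instead of $push.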
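Option 1 could be sketched as follows. This is only an illustration under some assumptions: `sortArrayStages` and `buildPipeline` are hypothetical helper names, the field names come from the question, and the sketch relies on $push collecting elements in the order the sorted documents arrive. Option 1 is shown rather than option 2 because the MongoDB documentation states that the order of elements produced by $addToSet is unspecified, so a final $addToSet may not keep the sort.

```javascript
// Stages that sort one array field while carrying the other arrays along unchanged.
function sortArrayStages(field, otherFields, lang) {
  const group = { _id: null, [field]: { $push: `$${field}` } };
  for (const other of otherFields) {
    // The other arrays are repeated on every unwound document; keep one copy.
    group[other] = { $first: `$${other}` };
  }
  return [
    { $unwind: `$${field}` },
    { $sort: { [`${field}.${lang}`]: 1 } },
    { $group: group },
  ];
}

function buildPipeline(lang) {
  const fields = ['subjects', 'types', 'genres'];
  const stages = [
    // Collect the unique values first, as in the question.
    { $unwind: '$filters.subjects' },
    { $unwind: '$filters.genres' },
    { $group: {
      _id: null,
      subjects: { $addToSet: '$filters.subjects' },
      types: { $addToSet: '$filters.type' },
      genres: { $addToSet: '$filters.genres' },
    }},
  ];
  // Then unwind, sort and regroup each array separately, one at a time.
  for (const field of fields) {
    const others = fields.filter((f) => f !== field);
    stages.push(...sortArrayStages(field, others, lang));
  }
  stages.push({ $project: { _id: false, subjects: 1, types: 1, genres: 1 } });
  return stages;
}

const pipeline = buildPipeline('en');
```

The resulting array can be passed to `collection.aggregate(pipeline)` in the Node.js driver. Because only one array is unwound at any time, no cross product is created and each unique value appears once.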
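There may be other options, but either of the above should be enough to get the job done quickly.

Also, looking at the sample collection and the output, I can see that all three genres in the sample data are the same, yet your output shows two. $addToSet should reduce them to a single genre. Or am I getting something wrong?

Thanks for the hint about the data duplication! It solved my problem!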