MongoDB: properly grouping documents in an aggregation pipeline to find a set intersection


Suppose I have these two documents:

{  
   "_id":"sampleA",
   "value":{  
      "data":[  
         {  
            "thing":"A"
         },
         {  
            "thing":"B"
         },
         {  
            "thing":"C"
         },
         {  
            "thing":"D"
         },
         {  
            "thing":"E"
         }
      ]
   }
}

 {  
   "_id":"sampleB",
   "value":{  
      "data":[  
         {  
            "thing":"C"
         },
         {  
            "thing":"D"
         },
         {  
            "thing":"E"
         },
         {  
            "thing":"F"
         }
      ]
   }
}

I want to combine them into a single document that keeps the "sampleA" and "sampleB" labels, so that I can use the set intersection operator. How do I do that? I tried:

db.testz.aggregate(
      [{
        $match: {
          _id: {
            $in: ["sampleA", "sampleB"]
          }
        }
      }, {
        '$group': {
          _id: null,
          a: {
            $push: "$value"
          }
        }
      }]
    );

This gives me:

{
  "_id": null,
  "a": [
    {
      "data": [
        {
          "thing": "A"
        },
        {
          "thing": "B"
        },
        {
          "thing": "C"
        },
        {
          "thing": "D"
        },
        {
          "thing": "E"
        }
      ]
    },
    {
      "data": [
        {
          "thing": "C"
        },
        {
          "thing": "D"
        },
        {
          "thing": "E"
        },
        {
          "thing": "F"
        }
      ]
    }
  ]
}

I assume I could then use the set intersection operator, if I could do something like this:

    db.testz.aggregate(
      [{
        $match: {
          _id: {
            $in: ["sampleA", "sampleB"]
          }
        }
      }, {
        '$group': {
          _id: null,
          a: {
            $push: "$value"
          }
        }
      }, {
        '$project': {
          int: {
            $setIntersection: ["$a.0", "$a.1"]
          }
        }
      }]
    );

^^ Obviously, that last step doesn't work, but I'm trying to illustrate the idea.

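I think the only way to do this right now (MongoDB 2.6) is to unwind the arrays, group by element to record which documents contain it, and then collect the elements found in every document back into one array:

> db.testz.aggregate([
    { "$match" : { "_id" : { "$in" : ["sampleA", "sampleB"] } } },
    { "$unwind" : "$value.data" },
    // For each distinct element, record which documents contain it
    { "$group" : { "_id" : "$value.data", "docs" : { "$addToSet" : "$_id" } } },
    // Keep only elements present in both matched documents
    { "$match" : { "docs" : { "$size" : 2 } } },
    { "$group" : { "_id" : 0, "intersection" : { "$push" : "$_id" } } }
])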

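This isn't an efficient way to do it, but it gets the job done. I kept asking you for more specifics to see whether there was a way to avoid this answer :(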
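On servers newer than the MongoDB 2.6 this answer targets, the unwind can probably be avoided. The sketch below assumes MongoDB 3.4 or later, where `$reduce` and `$arrayElemAt` exist. The names `pipeline` and `arrays` are only illustrative, and I have not compared its performance with the unwind approach. It folds `$setIntersection` over however many `data` arrays the `$match` stage selects.

```javascript
// Assumes MongoDB 3.4+ ($reduce, $arrayElemAt); these are not available in 2.6.
const pipeline = [
  { $match: { _id: { $in: ["sampleA", "sampleB"] } } },
  // Collect every matched document's data array into one array of arrays.
  { $group: { _id: null, arrays: { $push: "$value.data" } } },
  {
    $project: {
      _id: 0,
      intersection: {
        $reduce: {
          input: "$arrays",
          // Start from the first array; intersecting it with itself is harmless.
          initialValue: { $arrayElemAt: ["$arrays", 0] },
          in: { $setIntersection: ["$$value", "$$this"] }
        }
      }
    }
  }
];

// In the mongo shell: db.testz.aggregate(pipeline)
```

Adding more `_id` values to the `$in` list should extend the intersection to more than two documents without changing the other stages.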

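Can you explain the overall purpose of this operation? Which sets do you want to intersect? Two arrays in two documents? Arrays in certain pairs of documents across the whole collection? If the former, why not do this small operation in client code? If the latter, we need more explanation of how you are pairing documents and what the set intersection operation is for.

The former... I'm just trying to find the intersection of two data arrays. The operation won't always be small: at some point the data arrays will contain thousands of items. Also, I need to find the intersection of more than two "data" arrays (all in the same format).

So you want the intersection of all the arrays in the collection, one big intersection? It's still unclear what you want.

Start with the first document type I listed at the top of the post, and suppose I have up to n documents, each containing data that may be thousands of items long. I need to figure out how to combine several of those data arrays into one document so that I can use the set intersection operator on them.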
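The client-side option raised in the comments can be sketched in plain JavaScript. This is only an illustration: the helper name `intersectData` is hypothetical, and it assumes each array element is a subdocument identified by its `thing` field, as in the sample documents. It accepts any number of arrays, which matches the "more than two" requirement.

```javascript
// Intersect any number of arrays of { thing: ... } subdocuments.
// Elements are compared by their "thing" value (an assumption based on
// the sample documents in the question).
function intersectData(arrays) {
  if (arrays.length === 0) return [];
  // Start with the first array's keys, then drop keys missing from any other array.
  let common = new Set(arrays[0].map(d => d.thing));
  for (const arr of arrays.slice(1)) {
    const seen = new Set(arr.map(d => d.thing));
    common = new Set([...common].filter(k => seen.has(k)));
  }
  return [...common].map(thing => ({ thing }));
}

const sampleA = ["A", "B", "C", "D", "E"].map(thing => ({ thing }));
const sampleB = ["C", "D", "E", "F"].map(thing => ({ thing }));
const result = intersectData([sampleA, sampleB]);
console.log(result.map(d => d.thing).join(",")); // C,D,E
```

In the mongo shell, the input arrays could presumably be obtained with something like `db.testz.find({ _id: { $in: ["sampleA", "sampleB"] } }).toArray().map(doc => doc.value.data)`. Each set lookup is constant time on average, so the cost grows roughly linearly with the total number of elements.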
