Node.js Mongoose: counting distinct field values in an array returned by a $lookup aggregation stage


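I'm struggling with how to get the number of distinct field values in an array that is returned by a $lookup aggregation stage in MongoDB, using Mongoose. By the number of distinct field values I mean the number of unique values that a given field takes across the elements of that array.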

The parent document has the following structure:
{ _id: 678, name: "abc" }

The child document has the following structure:
{ _id: 1009, fieldA: 123, x: { id: 678, name: "abc" } }

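The $lookup stage is defined as follows:

{
  from: "children",
  localField: "_id",
  foreignField: "x.id",
  as: "xyz"
}

Let's assume that I get this array as the result of the $lookup aggregation stage for the parent with _id equal to 678: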

xyz: [ 
{ _id: 1009, fieldA: 123, x: { id: 678, name: "abc" } }, 
{ _id: 1010, fieldA: 3435, x: { id: 678, name: "abc" } }, 
{ _id: 1011, fieldA: 123, x: { id: 678, name: "abc" } } 
]
I want to know how many distinct fieldA values there are in this array. In this example, it would be 2.


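Of course, the step should be in the aggregation pipeline, after the $lookup stage and before (or inside?) the $project stage. As a side note, I also need the total number of elements in the xyz array as another value (the $size operator in the $project stage).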

So, going by what you describe, you would basically have data like this:

parents

{
  "_id": 1,
  "xyz": ["abc", "abd", "abe", "abf"]
}
children

{ "_id": "abc", "fieldA": 123 },
{ "_id": "abd", "fieldA": 34 },
{ "_id": "abe", "fieldA": 123 },
{ "_id": "abf", "fieldA": 54 }
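N.B. If you actually define the parent reference within the child, instead of an array of child references in the parent, there is a listing example at the bottom. The same principles generally apply in both cases.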

The current $lookup that produces a result like the one in the question would then be something like:

{ "$lookup": {
  "from": "children",
  "localField": "xyz",
  "foreignField": "_id",
  "as": "xyz"
}}
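Best approach

You could run further operations on the returned array to produce the total count and the distinct count, but with any modern MongoDB release (3.6 and later) there is a better approach. $lookup has a more expressive form that accepts a pipeline argument, which acts on the matched children: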

Parent.aggregate([
  { "$lookup": {
    "from": "children",
    "let": { "ids": "$xyz" },
    "pipeline": [
      { "$match": {
        "$expr": { "$in": [ "$_id", "$$ids" ] }
      }},
      { "$group": {
        "_id": "$fieldA",
        "total": { "$sum": 1 }
      }},
      { "$group": {
        "_id": null,
        "distinct": { "$sum": 1 },
        "total": { "$sum": "$total" }
      }}
    ],
    "as": "xyz"
  }},
  { "$addFields": {
    "xyz": "$$REMOVE",
    "distinctCount": { "$sum": "$xyz.distinct" },
    "totalCount": { "$sum": "$xyz.total" }
  }}
])
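The key point here is that you do not actually need to return all of the array results from the $lookup. Instead of working with the returned array of all matching children, you reduce that content inside the pipeline expression of the $lookup.

To get a total count and a distinct count of the inner content, after the initial $match conditions that specify the "join" and which matches to return, you $group on the "distinct" value as the key and keep a count of the elements found in total. The second $group uses a null value for the key, since the only things needed now are the count of the distinct keys already returned and the sum of the existing totals of counted elements.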
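The two $group stages can be followed by hand with a small plain JavaScript model. This is only a sketch of the logic, run against the sample children above with no database involved; `groupByFieldA` and `groupAll` are hypothetical helper names, not MongoDB or Mongoose functions.

```javascript
// Child documents from the sample data above.
const children = [
  { _id: "abc", fieldA: 123 },
  { _id: "abd", fieldA: 34 },
  { _id: "abe", fieldA: 123 },
  { _id: "abf", fieldA: 54 }
];

// Models { $group: { _id: "$fieldA", total: { $sum: 1 } } }:
// one output document per distinct fieldA value, with its occurrence count.
function groupByFieldA(docs) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc.fieldA, (counts.get(doc.fieldA) || 0) + 1);
  }
  return [...counts].map(([key, total]) => ({ _id: key, total }));
}

// Models { $group: { _id: null, distinct: { $sum: 1 }, total: { $sum: "$total" } } }:
// collapses all groups into one document. Like $group, it emits nothing for empty input.
function groupAll(groups) {
  if (groups.length === 0) return [];
  return [
    groups.reduce(
      (acc, group) => ({
        _id: null,
        distinct: acc.distinct + 1,      // one per distinct key
        total: acc.total + group.total   // sum of the per-key counts
      }),
      { _id: null, distinct: 0, total: 0 }
    )
  ];
}

const perValue = groupByFieldA(children);
const summary = groupAll(perValue);
console.log(perValue); // three groups: 123 (twice), 34 (once), 54 (once)
console.log(summary);  // [ { _id: null, distinct: 3, total: 4 } ]
```

The intermediate `perValue` array is what the first $group hands to the second one; only the single summary document ever leaves the sub-pipeline.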

The result of course is:

{
  "_id": 1,
  "distinctCount": 3,
  "totalCount": 4
}
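Since we are using $addFields, all other fields present in the parent document are kept, except for xyz, which we explicitly remove via the $$REMOVE variable.

You can also note the usage of $sum in the last stage. The actual result of our $lookup pipeline is of course a single document, but it always sits inside an array, because that is what the output of $lookup always is. In this case $sum is just a very simple way (the shortest syntax) to extract those values from the array as single fields in the parent document.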
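A side effect worth spelling out is what happens for a parent with no matching children. The sketch below models the final stage in plain JavaScript, assuming that the sub-pipeline returns an empty array in that case and that $sum over an empty array yields 0, which matches the documented behaviour of $sum as far as I know. `sumPath` and `addCounts` are hypothetical helpers, and the second parent is made-up illustration data.

```javascript
// Hypothetical helper (not a MongoDB or Mongoose API): a plain JavaScript
// model of { $sum: "$xyz.<key>" } evaluated against the `xyz` array.
function sumPath(array, key) {
  return array.reduce(
    (sum, element) => (typeof element[key] === "number" ? sum + element[key] : sum),
    0
  );
}

// Shape of a parent after the $lookup stage when children matched.
const matched = { _id: 1, xyz: [{ _id: null, distinct: 3, total: 4 }] };
// Assumed shape for a parent without children: the sub-pipeline emitted nothing.
const unmatched = { _id: 2, xyz: [] };

// Models the $addFields stage, including "xyz": "$$REMOVE".
function addCounts(parent) {
  const { xyz, ...rest } = parent;
  return {
    ...rest,
    distinctCount: sumPath(xyz, "distinct"),
    totalCount: sumPath(xyz, "total")
  };
}

const withCounts = [matched, unmatched].map(addCounts);
console.log(withCounts);
```

Under these assumptions the childless parent still gets numeric fields (both 0) rather than missing ones, which is one reason $sum is convenient here.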

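Alternate

The other way, of course, is to simply work with the returned array, and all that really takes is the appropriate $setUnion and $size operators: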

Parent.aggregate([
  { "$lookup": {
    "from": "children",
    "localField": "xyz",
    "foreignField": "_id",
    "as": "xyz"
  }},
  { "$addFields": {
    "xyz": "$$REMOVE",
    "distinctCount": { "$size": { "$setUnion": [ [], "$xyz.fieldA" ] }},
    "totalCount": { "$size": "$xyz" }
  }}
])
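Here we basically use $setUnion, supplying as arguments an empty array [] and the array of fieldA values. This returns a "set" that is the combination of both arguments, and the one thing that defines a "set" is that each value can appear only once, so the values are distinct. It is a quick way to obtain just the distinct values. Each "array" (or "set") is then simply measured with $size to get its respective count.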
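The same idea can be modelled in plain JavaScript with a `Set`. This is only a sketch with no database involved: `setUnion` is a hypothetical stand-in for the $setUnion operator, and `xyz` is the array the non-pipeline $lookup would return for the sample data.

```javascript
// Hypothetical stand-in for $setUnion: combines its array arguments and
// keeps each value once. (The real operator does not guarantee any order.)
function setUnion(...arrays) {
  return [...new Set(arrays.flat())];
}

// The joined children as returned by the plain $lookup.
const xyz = [
  { _id: "abc", fieldA: 123 },
  { _id: "abd", fieldA: 34 },
  { _id: "abe", fieldA: 123 },
  { _id: "abf", fieldA: 54 }
];

// Models "$xyz.fieldA": the array of fieldA values, duplicates included.
const fieldAValues = xyz.map(child => child.fieldA);

const counts = {
  // { $size: { $setUnion: [ [], "$xyz.fieldA" ] } }
  distinctCount: setUnion([], fieldAValues).length,
  // { $size: "$xyz" }
  totalCount: xyz.length
};

console.log(counts); // { distinctCount: 3, totalCount: 4 }
```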

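So it "looks simple", but the problem is that it is not really efficient, mostly because we spend operational time returning those array values from the $lookup and then basically discard the result. This is why the former approach is preferred: it reduces the result before it is ever returned as an array. So "less work" overall.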

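If, on the other hand, you actually want to keep the array returned from the $lookup result, then the latter case is of course preferable.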
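For that case, a minimal sketch of the pipeline is the alternate form with the `"xyz": "$$REMOVE"` line left out, so the joined children stay on each parent next to the two counts. The name `keepArrayPipeline` is only illustrative, and the collection and field names are the ones assumed throughout this answer.

```javascript
// Alternate approach, keeping the joined array: identical to the pipeline
// shown earlier except that "xyz" is not removed in $addFields.
const keepArrayPipeline = [
  { "$lookup": {
    "from": "children",
    "localField": "xyz",
    "foreignField": "_id",
    "as": "xyz"
  }},
  { "$addFields": {
    // $setUnion with [] de-duplicates the fieldA values; $size counts them.
    "distinctCount": { "$size": { "$setUnion": [ [], "$xyz.fieldA" ] } },
    // $size of the joined array itself is the total number of children.
    "totalCount": { "$size": "$xyz" }
  }}
];

// Usage, assuming a connected Mongoose model named Parent:
//   const results = await Parent.aggregate(keepArrayPipeline);
console.log(JSON.stringify(keepArrayPipeline, null, 2));
```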


Example listing

And the output, for the schema in which the parent holds the xyz array of child _id values:

Mongoose: parents.createIndex({ xyz: 1 }, { background: true })
Mongoose: parents.deleteMany({}, {})
Mongoose: children.deleteMany({}, {})
Mongoose: parents.insertOne({ xyz: [ 'abc', 'abd', 'abe', 'abf' ], _id: 1, __v: 0 }, { session: null })
Mongoose: children.insertMany([ { _id: 'abc', fieldA: 123, __v: 0 }, { _id: 'abd', fieldA: 34, __v: 0 }, { _id: 'abe', fieldA: 123, __v: 0 }, { _id: 'abf', fieldA: 54, __v: 0 }], {})
Mongoose: parents.aggregate([ { '$lookup': { from: 'children', let: { ids: '$xyz' }, pipeline: [ { '$match': { '$expr': { '$in': [ '$_id', '$$ids' ] } } }, { '$group': { _id: '$fieldA', total: { '$sum': 1 } } }, { '$group': { _id: null, distinct: { '$sum': 1 }, total: { '$sum': '$total' } } } ], as: 'xyz' } }, { '$addFields': { xyz: '$$REMOVE', distinctCount: { '$sum': '$xyz.distinct' }, totalCount: { '$sum': '$xyz.total' } } }], {})
{
  "result1": [
    {
      "_id": 1,
      "__v": 0,
      "distinctCount": 3,
      "totalCount": 4
    }
  ]
}
Mongoose: parents.aggregate([ { '$lookup': { from: 'children', localField: 'xyz', foreignField: '_id', as: 'xyz' } }, { '$addFields': { xyz: '$$REMOVE', distinctCount: { '$size': { '$setUnion': [ [], '$xyz.fieldA' ] } }, totalCount: { '$size': '$xyz' } } }], {})
{
  "result2": [
    {
      "_id": 1,
      "__v": 0,
      "distinctCount": 3,
      "totalCount": 4
    }
  ]
}
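Example without child array in parent

This shows defining the schema without the array of values in the parent, and instead defining the parent reference in all of the children: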

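const mongoose = require('mongoose');
const { Schema } = mongoose;

const uri = 'mongodb://localhost:27017/test';
const options = { useNewUrlParser: true, useUnifiedTopology: true };

// Log every operation Mongoose sends to MongoDB
mongoose.set('debug', true);
// Mongoose 5.x settings; Mongoose 6 and later no longer support these options
mongoose.set('useFindAndModify', false);
mongoose.set('useCreateIndex', true);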


const parentSchema = new Schema({
  _id: Number,
},{ _id: false });

parentSchema.virtual("xyz", {
  ref: 'Child',
  localField: '_id',
  foreignField: 'parent',
  justOne: false
});

const childSchema = new Schema({
  _id: String,
  parent: Number,
  fieldA: Number
},{ _id: false });

childSchema.index({ "parent": 1 });


const Parent = mongoose.model('Parent', parentSchema);
const Child = mongoose.model('Child', childSchema);


const log = data => console.log(JSON.stringify(data, undefined, 2));

(async function() {

  try {

    const conn = await mongoose.connect(uri, options);

    // Clean data for demonstration
    await Promise.all(
      Object.values(conn.models).map(m => m.deleteMany())
    );


    // Insert some data
    await Parent.create({ "_id": 1 });
    await Child.insertMany([
     { "_id": "abc", "fieldA": 123 },
     { "_id": "abd", "fieldA": 34 },
     { "_id": "abe", "fieldA": 123 },
     { "_id": "abf", "fieldA": 54 }
    ].map(e => ({ ...e, "parent": 1 })));


    let result1 = await Parent.aggregate([
      { "$lookup": {
        "from": Child.collection.name,
        "let": { "parent": "$_id" },
        "pipeline": [
          { "$match": {
            "$expr": { "$eq": [ "$parent", "$$parent" ] }
          }},
          { "$group": {
            "_id": "$fieldA",
            "total": { "$sum": 1 }
          }},
          { "$group": {
            "_id": null,
            "distinct": { "$sum": 1 },
            "total": { "$sum": "$total" }
          }}
        ],
        "as": "xyz"
      }},
      { "$addFields": {
        "xyz": "$$REMOVE",
        "distinctCount": { "$sum": "$xyz.distinct" },
        "totalCount": { "$sum": "$xyz.total" }

      }}
    ]);

    log({ result1 });

    let result2 = await Parent.aggregate([
      { "$lookup": {
        "from": Child.collection.name,
        "localField": "_id",
        "foreignField": "parent",
        "as": "xyz"
      }},
      { "$addFields": {
        "xyz": "$$REMOVE",
        "distinctCount": { "$size": { "$setUnion": [ [], "$xyz.fieldA" ] } },
        "totalCount": { "$size": "$xyz" }
      }}
    ]);

    log({ result2 })

  } catch(e) {
    console.error(e);
  } finally {
    mongoose.disconnect();
  }


})()
And the output:

Mongoose: children.createIndex({ parent: 1 }, { background: true })
Mongoose: parents.deleteMany({}, {})
Mongoose: children.deleteMany({}, {})
Mongoose: parents.insertOne({ _id: 1, __v: 0 }, { session: null })
Mongoose: children.insertMany([ { _id: 'abc', fieldA: 123, parent: 1, __v: 0 }, { _id: 'abd', fieldA: 34, parent: 1, __v: 0 }, { _id: 'abe', fieldA: 123, parent: 1, __v: 0 }, { _id: 'abf', fieldA: 54, parent: 1, __v: 0 }], {})
Mongoose: parents.aggregate([ { '$lookup': { from: 'children', let: { parent: '$_id' }, pipeline: [ { '$match': { '$expr': { '$eq': [ '$parent', '$$parent' ] } } }, { '$group': { _id: '$fieldA', total: { '$sum': 1 } } }, { '$group': { _id: null, distinct: { '$sum': 1 }, total: { '$sum': '$total' } } } ], as: 'xyz' } }, { '$addFields': { xyz: '$$REMOVE', distinctCount: { '$sum': '$xyz.distinct' }, totalCount: { '$sum': '$xyz.total' } } }], {})
{
  "result1": [
    {
      "_id": 1,
      "__v": 0,
      "distinctCount": 3,
      "totalCount": 4
    }
  ]
}
Mongoose: parents.aggregate([ { '$lookup': { from: 'children', localField: '_id', foreignField: 'parent', as: 'xyz' } }, { '$addFields': { xyz: '$$REMOVE', distinctCount: { '$size': { '$setUnion': [ [], '$xyz.fieldA' ] } }, totalCount: { '$size': '$xyz' } } }], {})
{
  "result2": [
    {
      "_id": 1,
      "__v": 0,
      "distinctCount": 3,
      "totalCount": 4
    }
  ]
}

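I ended up using the first approach suggested by @Neil Lunn. Since my parent and child schemas differ from what @Neil Lunn assumed, I am posting my own answer, which solves my particular problem: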

Parent.aggregate([
      {
        $lookup: {
          from: "children",
          let: { id: "$_id" },
          pipeline: [
            { $match: { $expr: { $eq: ["$x.id", "$$id"] } } },
            {
              $group: {
                _id: "$fieldA",
                count: { $sum: 1 }
              }
            },
            {
              $group: {
                _id: null,
                fieldA: { $sum: 1 },
                count: { $sum: "$count" }
              }
            }
          ],
          as: "children"
        }
      },
      {
        $project: {
           total: { $sum: "$children.count" },
           distinct: { $sum: "$children.fieldA" }
        }
      }
    ]);

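What have you actually tried? You may find people more willing to help if you show that you have made some effort to solve this yourself: include your attempt, and show what does not work for you and what you expected.

Also: "Of course, the step should be in the aggregation pipeline, after the $lookup ...". As an important note, that approach is not actually optimal. If you only need the single "total" and "distinct" values, then doing this after the $lookup is the worst case. Modern MongoDB releases have an "inner" version.

What do you mean by "inner"? As far as I know, $lookup always returns an array of values, so with this operator it is not possible to return only the number of distinct values and the total count for a field.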