MongoDB: filter specific elements of an ordered array of objects based on the previous object's value (aggregation framework)


I have the following documents:

[
    {
        '_id': 1,
        'role': [
            {  // keep this document
                'plan': 'free',
                'date': ISODate('2020-01-01')
            },
            {  
                'plan': 'free',
                'date': ISODate('2020-01-02')
            },
            {
                'plan': 'free',
                'date': ISODate('2020-01-03')
            },
            {  // keep this document
                'plan': 'pro',
                'date': ISODate('2020-01-04')
            },
            {
                'plan': 'pro',
                'date': ISODate('2020-01-05')
            },
            {
                'plan': 'pro',
                'date': ISODate('2020-01-06')
            },
            {  // keep this document
                'plan': 'free',
                'date': ISODate('2020-01-08')
            },
            {
                'plan': 'free',
                'date': ISODate('2020-01-09')
            }
        ]
    },
    {
        '_id': 2,
        'role': [
            {  // keep this document
                'plan': 'pro',
                'date': ISODate('2020-02-05')
            },
            {
                'plan': 'pro',
                'date': ISODate('2020-02-06')
            },
            {  // keep this document
                'plan': 'free',
                'date': ISODate('2020-02-07')
            },
            {
                'plan': 'free',
                'date': ISODate('2020-02-08')
            },
            {
                'plan': 'free',
                'date': ISODate('2020-02-09')
            },
            {  // keep this document
                'plan': 'pro',
                'date': ISODate('2020-02-10')
            },
            {
                'plan': 'pro',
                'date': ISODate('2020-02-11')
            },
            {
                'plan': 'pro',
                'date': ISODate('2020-02-12')
            }
        ]
    }
]

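I need to filter the elements of the `role` array based on changes in the value of the `plan` field. I always want to keep the first element, but a later element should be kept only when the value of `plan` changes (for example, from `free` to `pro`, or from `pro` to `free`).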

Obs.: the `plan` field can hold more distinct values (for example `premium`, `admin`, etc.); I have only included two documents as an example.
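To make the requirement concrete, here is a minimal sketch of the intended filtering in plain JavaScript, outside MongoDB. The helper name `keepPlanChanges` is hypothetical, and the dates are written as plain strings for brevity; neither is part of the original question.

```javascript
// Hypothetical helper: keep the first element, plus every element whose
// `plan` differs from the element immediately before it.
function keepPlanChanges(roles) {
  return roles.filter(
    (entry, i) => i === 0 || entry.plan !== roles[i - 1].plan
  );
}

// The `role` array of document 1, with dates as plain strings (assumption).
const sampleRole = [
  { plan: 'free', date: '2020-01-01' },
  { plan: 'free', date: '2020-01-02' },
  { plan: 'free', date: '2020-01-03' },
  { plan: 'pro', date: '2020-01-04' },
  { plan: 'pro', date: '2020-01-05' },
  { plan: 'pro', date: '2020-01-06' },
  { plan: 'free', date: '2020-01-08' },
  { plan: 'free', date: '2020-01-09' },
];

console.log(keepPlanChanges(sampleRole).map((r) => r.date));
// -> [ '2020-01-01', '2020-01-04', '2020-01-08' ]
```

The three dates printed are the elements marked "keep this document" in the sample data for `_id: 1`.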

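I think this operation may be overkill if it runs on a large dataset where the `role` array contains a large number of objects. You can try the following aggregation query: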

db.collection.aggregate([
    /** As `role` field already exists `$addFields` will overwrite with new value */
    {
      $addFields: {
        role: {
          $let: {
            vars: {
              data: {
                $reduce: {
                  input: { $slice: [ "$role", 1, { $size: "$role" } ] }, /** array input without first object */
                  initialValue: { roleObjs: [ { $arrayElemAt: [ "$role", 0 ] } ], plan: { $arrayElemAt: [ "$role.plan", 0 ] } }, /** Pick first object & first object's plan as initial values */
                  in: {
                    roleObjs: { $cond: [ { $eq: [ "$$this.plan", "$$value.plan" ] }, "$$value.roleObjs", { $concatArrays: [ "$$value.roleObjs", [ "$$this" ] ] } ] }, /** Conditional check & merge new object to array or return holding array as is  */
                    plan: { $cond: [ { $eq: [ "$$this.plan", "$$value.plan" ] }, "$$value.plan", "$$this.plan" ] }
                  }
                }
              }
            },
            in: "$$data.roleObjs" /** Return newly formed `roleObjs` array in local variable */
          }
        }
      }
    }
  ])
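To see how the accumulator in this `$reduce` evolves, the same logic can be mirrored with `Array.prototype.reduce` in plain JavaScript. This is only an illustrative sketch: the function name `reduceRoles` is hypothetical, the dates are plain strings, and the empty-array guard is an added assumption that the pipeline itself does not contain.

```javascript
// Plain-JavaScript mirror of the accumulator { roleObjs, plan }.
function reduceRoles(role) {
  if (role.length === 0) return []; // guard added for this sketch only

  // Same starting point as the pipeline: first object and its plan.
  const initialValue = { roleObjs: [role[0]], plan: role[0].plan };

  // Iterate over the array without its first object.
  const data = role.slice(1).reduce((value, current) => {
    const samePlan = current.plan === value.plan;
    return {
      // Append the current object only when the plan changed.
      roleObjs: samePlan ? value.roleObjs : value.roleObjs.concat([current]),
      // Remember the plan of the object just seen.
      plan: current.plan,
    };
  }, initialValue);

  return data.roleObjs;
}

// The `role` array of document 2, with dates as plain strings (assumption).
const doc2Role = [
  { plan: 'pro', date: '2020-02-05' },
  { plan: 'pro', date: '2020-02-06' },
  { plan: 'free', date: '2020-02-07' },
  { plan: 'free', date: '2020-02-08' },
  { plan: 'free', date: '2020-02-09' },
  { plan: 'pro', date: '2020-02-10' },
  { plan: 'pro', date: '2020-02-11' },
  { plan: 'pro', date: '2020-02-12' },
];

console.log(reduceRoles(doc2Role).map((r) => r.date));
// -> [ '2020-02-05', '2020-02-07', '2020-02-10' ]
```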


Here is an aggregation that produces the desired result:

db.collection.aggregate( [ 
  { 
      $addFields: { 
          plans: { 
              $reduce: { 
                  input: "$role", 
                  initialValue: [], 
                  in: { $concatArrays: [ "$$value", [ "$$this.plan" ] ] } 
              } 
          } 
      } 
  },
  { 
      $addFields: {
          role: { 
              $reduce: {
                  input: { $range: [ 1, { $size: "$role" } ] }, /** indexes 1..n-1; index 0 is already in initialValue, and the last index must be included */
                  initialValue: { prevPlan: { $arrayElemAt: [ "$plans", 0 ] }, roles: [ { $arrayElemAt: [ "$role", 0 ] } ] },
                  in: {
                      $cond: [ { $eq: [ { $arrayElemAt: [ "$plans", "$$this"] }, "$$value.prevPlan" ] },
                               { prevPlan: { $arrayElemAt: [ "$plans", "$$this"] },
                                 roles: { $concatArrays: [ "$$value.roles", [ ] ] } 
                               },
                               { prevPlan: { $arrayElemAt: [ "$plans", "$$this" ] },
                                 roles: { $concatArrays: [ "$$value.roles", [ { $arrayElemAt: [ "$role", "$$this" ] } ] ] } 
                               }
                      ]
                  }
              }
          }
      }
  },
  { 
      $project: { role: "$role.roles" }
  }
] )
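The index-based idea can also be sketched in plain JavaScript: build an array of plan values, then walk the indexes and look each element up by position. The sketch assumes the walk covers every index from 1 through the last one, so that a plan change on the final element is kept too. The function name `filterByIndex` and the empty-array guard are hypothetical additions.

```javascript
// Index-based variant: track { prevPlan, roles } while walking positions.
function filterByIndex(role) {
  if (role.length === 0) return []; // guard added for this sketch only

  // Counterpart of the helper `plans` field: one plan value per element.
  const plans = role.map((r) => r.plan);

  let state = { prevPlan: plans[0], roles: [role[0]] };
  for (let i = 1; i < role.length; i++) {
    state = {
      prevPlan: plans[i],
      // Keep role[i] only when its plan differs from the previous one.
      roles:
        plans[i] === state.prevPlan
          ? state.roles
          : state.roles.concat([role[i]]),
    };
  }
  return state.roles;
}

// A change on the very last element is kept as well.
const shortRole = [
  { plan: 'free', date: '2020-01-01' },
  { plan: 'free', date: '2020-01-02' },
  { plan: 'pro', date: '2020-01-03' },
];

console.log(filterByIndex(shortRole).map((r) => r.date));
// -> [ '2020-01-01', '2020-01-03' ]
```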

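You can use the aggregation array operators to get the desired result.

I looked at the `$reduce` documentation, and it does seem that it could solve the problem. I tried this, but it did not work:

{'$addFields': {'t': {'$reduce': {'input': '$roles', 'initialValue': {'$arrayElemAt': ['$roles', 0]}, 'in': {'$cond': {'if': {'$eq': ['$$value.plan', '$$this.plan']}, 'then': '$$value', 'else': '$$this'}}}}}}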

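@igorkf: Try this query:

Yes, it works just as I need. Why do you think this might be overkill?

@igorkf: As always, iterating over lists/arrays is a concern in development, whatever the language or environment :-) If the dataset or the array is larger than 100 MB, I think it may run slowly, or it may exceed the 100 MB RAM limit for a single aggregation stage (although such a case is probably rare). If it works well for your dataset, then you are good to go :-)

I see. Maybe the problem comes from how the data is structured. I think it could be avoided by planning a better way to structure this type of data. Thanks for the help.

@igorkf: Yes, designing the data before storing it is very important and is often ignored or overlooked, a common problem :-)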
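One way to read the data-design remark above is to record an entry only when the plan actually changes, so no filtering is needed at query time. The following plain JavaScript sketch illustrates that idea under a strong assumption: the intermediate entries with an unchanged plan are not needed for anything else. The helper name `appendRole` is hypothetical and does not come from the thread.

```javascript
// Hypothetical write-time helper: append an entry only when its plan
// differs from the last stored entry. Returns a new array.
function appendRole(roles, entry) {
  const last = roles[roles.length - 1];
  if (last !== undefined && last.plan === entry.plan) {
    return roles; // same plan as before: nothing new to record
  }
  return roles.concat([entry]);
}

// Replaying a few events (dates as plain strings for brevity).
let history = [];
history = appendRole(history, { plan: 'free', date: '2020-01-01' });
history = appendRole(history, { plan: 'free', date: '2020-01-02' });
history = appendRole(history, { plan: 'pro', date: '2020-01-04' });
history = appendRole(history, { plan: 'pro', date: '2020-01-05' });

console.log(history.map((r) => r.date));
// -> [ '2020-01-01', '2020-01-04' ]
```

Whether this trade-off is acceptable depends on whether the skipped entries carry information that other queries rely on.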