MongoDB: aggregation $group with an $eq condition on a string field


I am using MongoDB v2.2.0.

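I wrote the query below, but the main problem is $arrayElemAt. The standard replacement, $unwind followed by $first, does not work for me, and I believe a better solution exists. I have a constraint: this aggregation pipeline has to run as a single operation, rather than running separate queries for the positive and the negative data and then merging the results in code. I need to apply $sort, $limit and $skip to the resulting query in order to limit the number of words that are used to filter records from another collection; the data from the two collections is then merged in Java code.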

Aggregation query:

[
  {
    $match: {
      "merchantId": ObjectId("59520e6ccc7a701fbed31f94"),
      "date": {
        "$gte": NumberLong(1389644800000),
        "$lt": NumberLong(1502409599999)
      },
      "isbn": "a123"
    }
  },
  {
    $project: {
      "word": 1,
      "sentence": 1,
      "type": 1,
      "date": 1
    }
  },
  {
    $sort: {
      "date": -1
    }
  },
  {
    $group: {
      "_id": {
        "word": "$word",
        "type": "$type"
      },
      "date": {
          $max: "$date"
      },
      "sentence": {
        $first: "$sentence"
      },
      "sentenceCount": {
        "$sum": 1
      }
    },    
  },
  {
    // Second pass: one document per word.
    $group: {
      "_id": "$_id.word",
      "word": { $first: "$_id.word" },
      "positiveCount": { $sum: { $cond: [{ $eq: ["$_id.type", "positive"] }, "$sentenceCount", 0] } },
      "count": { $sum: "$sentenceCount" },
      // "$noval" refers to a field that does not exist, so nothing is
      // pushed for groups of the other type.
      "positiveSentence": {
        "$push": {
          "$cond": [{ $eq: ["$_id.type", "positive"] }, "$sentence", "$noval"]
        }
      },
      "negativeSentence": {
        "$push": {
          "$cond": [{ $eq: ["$_id.type", "negative"] }, "$sentence", "$noval"]
        }
      }
    }
  },
  {
    $project: {
      "_id": 0,
      "word": 1,
      "sentimentPercentage": { $cond: [{ $eq: ["$count", 0] }, 0, { $multiply: [{ $divide: ["$positiveCount", "$count"] }, 100] }] },
      // This is the operator that needs replacing.
      "positiveSentence": { $arrayElemAt: ["$positiveSentence", 0] },
      "negativeSentence": { $arrayElemAt: ["$negativeSentence", 0] }
    }
  },
  {
    $sort: {
            sentimentPercentage: -1
    }
  },
  {
    $limit: 50
  }
]
Collection document "schema":

Expected output:

{ 
    "word" : "expectations", 
    "sentimentPercentage" : 100.0, 
    "positiveSentence" : "The service exceeded our expectations."
},
{ 
    "word" : "representative", 
    "sentimentPercentage" : 87.5, 
    "positiveSentence" : "Excellent local representative, met the flight and gave us all the relevant information to ensure a great holiday.", 
    "negativeSentence" : "The representative at resort was poor."
},
{ 
    "word" : "seats", 
    "sentimentPercentage" : 0.0, 
    "negativeSentence" : "Long delay and pre booked seats were lost ."
}

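Please tell me how to replace the $arrayElemAt operator or, better, how to use MongoDB's features to optimize this query so that it produces the desired output.

The following seems to give me reasonable results. However, I don't think it will work correctly in all cases: the $unwind stage in v2.2 does not support the preserveNullAndEmptyArrays option (it was only added in 3.2), so you lose every word that has no positive or no negative sentence.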

db.getCollection('test').aggregate([
  {
    $project: {
      "word": 1,
      "sentence": 1,
      "type": 1,
      "date": 1
    }
  },
  {
    $sort: {
      "date": -1
    }
  },
  {
    $group: {
      "_id": {
        "word": "$word",
        "type": "$type"
      },
      "date": {
          $max: "$date"
      },
      "sentence": {
        $first: "$sentence"
      },
      "sentenceCount": {
        "$sum": 1
      }
    },    
  },
  {
    // Second pass: one document per word.
    $group: {
      "_id": "$_id.word",
      "word": { $first: "$_id.word" },
      "positiveCount": { $sum: { $cond: [{ $eq: ["$_id.type", "positive"] }, "$sentenceCount", 0] } },
      "count": { $sum: "$sentenceCount" },
      "positiveSentence": {
        "$push": {
          "$cond": [{ $eq: ["$_id.type", "positive"] }, "$sentence", "$noval"]
        }
      },
      "negativeSentence": {
        "$push": {
          "$cond": [{ $eq: ["$_id.type", "negative"] }, "$sentence", "$noval"]
        }
      }
    }
  },
  // Replace $arrayElemAt: unwind each array and keep its first element.
  // A word whose array is empty is dropped by $unwind here.
  { $unwind: "$positiveSentence" },
  {
    $group: {
      "_id": "$_id",
      "word": { $first: "$word" },
      "count": { $first: "$count" },
      "positiveCount": { $first: "$positiveCount" },
      "positiveSentence": { $first: "$positiveSentence" },
      "negativeSentence": { $first: "$negativeSentence" }
    }
  },
  { $unwind: "$negativeSentence" },
  {
    $group: {
      "_id": "$_id",
      "word": { $first: "$word" },
      "count": { $first: "$count" },
      "positiveCount": { $first: "$positiveCount" },
      "positiveSentence": { $first: "$positiveSentence" },
      "negativeSentence": { $first: "$negativeSentence" }
    }
  },
  {
    $project: {
      "_id": 0,
      "word": 1,
      "sentimentPercentage": { $cond: [{ $eq: ["$count", 0] }, 0, { $multiply: [{ $divide: ["$positiveCount", "$count"] }, 100] }] },
      "positiveSentence": 1,
      "negativeSentence": 1
    }
  }
])
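
To illustrate the caveat about $unwind stated before the pipeline, here is a small in-memory emulation in plain JavaScript. It is only a sketch, not MongoDB's implementation: the unwind helper and the grouped sample documents are assumptions made for illustration (the sentences are taken from the expected output in the question), and the helper assumes the field is an array, null, or missing.

```javascript
// Illustrative emulation of $unwind on in-memory documents.
// `preserve` mimics the preserveNullAndEmptyArrays option (MongoDB 3.2+).
function unwind(docs, field, preserve = false) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (Array.isArray(value) && value.length > 0) {
      // One output document per array element.
      for (const item of value) out.push({ ...doc, [field]: item });
    } else if (preserve) {
      // Keep the document; an empty array is removed from the output.
      const copy = { ...doc };
      if (Array.isArray(value)) delete copy[field];
      out.push(copy);
    }
    // Without `preserve`, empty, null, or missing values drop the document.
  }
  return out;
}

// Assumed shape of the documents after the second $group stage.
const grouped = [
  { _id: "expectations",
    positiveSentence: ["The service exceeded our expectations."],
    negativeSentence: [] },
  { _id: "representative",
    positiveSentence: ["Excellent local representative, met the flight and gave us all the relevant information to ensure a great holiday."],
    negativeSentence: ["The representative at resort was poor."] },
  { _id: "seats",
    positiveSentence: [],
    negativeSentence: ["Long delay and pre booked seats were lost ."] }
];

// Behaviour of the two plain $unwind stages: one-sided words disappear.
const strict = unwind(unwind(grouped, "positiveSentence"), "negativeSentence");

// Behaviour with preserveNullAndEmptyArrays: every word survives.
const preserved = unwind(unwind(grouped, "positiveSentence", true), "negativeSentence", true);

console.log(strict.map(d => d._id));
console.log(preserved.map(d => d._id));
```

With this sample only "representative" survives the two plain $unwind stages, while all three words survive when empty arrays are preserved.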

You can simplify this further, for example by dropping the first projection and grouping stages. If you like, I may look into it in a few hours.
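
One possible shape of that simplification is sketched below. It is not the answerer's final solution: it assumes a server with $arrayElemAt (MongoDB 3.2 or later), assumes that $push sees documents in the order produced by the preceding $sort, and uses a hypothetical helper, whenType, that is not part of MongoDB. The pipeline is built as a plain JavaScript array so that it can be inspected without a database.

```javascript
// Hypothetical helper (not a MongoDB API): yields `thenExpr` when the
// document's `type` equals `type`, otherwise `elseExpr`.
function whenType(type, thenExpr, elseExpr) {
  return { $cond: [{ $eq: ["$type", type] }, thenExpr, elseExpr] };
}

const pipeline = [
  // Newest documents first, so index 0 of each pushed array is the latest sentence.
  { $sort: { date: -1 } },
  {
    // Single pass: group directly by word instead of by (word, type) first.
    $group: {
      _id: "$word",
      count: { $sum: 1 },
      positiveCount: { $sum: whenType("positive", 1, 0) },
      // "$noval" references a missing field, as in the pipelines above,
      // so nothing is pushed for sentences of the other type.
      positiveSentence: { $push: whenType("positive", "$sentence", "$noval") },
      negativeSentence: { $push: whenType("negative", "$sentence", "$noval") }
    }
  },
  {
    $project: {
      _id: 0,
      word: "$_id",
      // count is at least 1 for every group, so no zero check is needed here.
      sentimentPercentage: { $multiply: [{ $divide: ["$positiveCount", "$count"] }, 100] },
      positiveSentence: { $arrayElemAt: ["$positiveSentence", 0] },
      negativeSentence: { $arrayElemAt: ["$negativeSentence", 0] }
    }
  },
  { $sort: { sentimentPercentage: -1 } },
  { $limit: 50 }
];

console.log(JSON.stringify(pipeline, null, 2));
```

The array could then be passed to db.getCollection('test').aggregate(pipeline), with the $match stage from the question placed first. Note that each array collects every sentence of a word before the first element is picked, so this trades memory for a shorter pipeline.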

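Could you provide some sample data and the desired output?

@dnickless I have already provided a sample of the expected output. Fortunately, I found out that we have upgraded our database to 3.4.4+. But I am still interested in how to solve this puzzle. No hurry, I will wait.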