How to query and filter multiple nested arrays using Java
My goal is to return multiple questionElements where the questionElements metaTag entry equals my search. For example, if a metaTag element equals my string, then return its parent questionEntry element, searching across all elements nested in show.
So what I want is to match the documents that contain the desired "metaTags" value, and to "filter" out any sub-document arrays that do not contain this inner match.
This is the aggregation query I have tried, but it does not give me the result I want:
db.mongoColl.aggregate([
    { "$redact": {
        "$cond": {
            "if": { "$gt": [
                { "$size": {
                    "$setIntersection": [
                        { "$ifNull": [ "$metaTags", [] ] },
                        [ "MySearchString" ]
                    ]
                }},
                0
            ]},
            "then": "$$PRUNE",
            "else": "$$DESCEND"
        }
    }}
]).pretty();
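My instances are:
private DB mongoDatabase;
private DBCollection mongoColl;
private DBObject dbObject;
// Singleton class
// Create client (server address(host, port), credential, options)
mongoClient = new MongoClient(new ServerAddress(host, port),
        Collections.singletonList(credential),
        options);
mongoDatabase = ClientSingleton.getInstance().getClient().getDB("MyDB");
The document in my database that I want to match is: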
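{
  "show": [
    {
      "season": [
        {
          "episodes": [
            {
              "questionEntry": {
                "id": 1,
                "info": {
                  "seasonNumber": 1,
                  "episodeNumber": 5,
                  "episodeName": "A Hero Sits Next Door"
                },
                "questionItem": {
                  "question": "What was the name of the ringer hired by Mr. Weed?",
                  "attachment": {
                    "type": 1,
                    "value": ""
                  }
                },
                "choices": [
                  {
                    "type": 1,
                    "value": "Johnson"
                  },
                  {
                    "type": 1,
                    "value": "Hideo"
                  },
                  {
                    "type": 1,
                    "value": "Guillermo"
                  }
                ],
                "answer": {
                  "questionId": 1,
                  "answer": 3
                },
                "metaTags": [
                  "Season 1",
                  "Episode 5",
                  "Trivia",
                  "Arya Stark",
                  "House Stark"
                ]
              }
            }
          ]
        }
      ]
    }
  ]
}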
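However, if any array in the document does not contain the "metaTags" value to match, i.e. "Arya Stark", then I do not want any element of that array in the result. The "metaTags" array itself can stay as it is.
I am running the latest Java driver and compiling with Java SE 1.7 in Eclipse, in case that affects the answer.
You can use the following code for the aggregation: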
mongoClient = new MongoClient("127.0.0.1", 27017);
DB db = mongoClient.getDB("db_name");
DBCollection dbCollection = db.getCollection("collection_name");
//make aggregation pipeline here
List<DBObject> pipeline = new ArrayList<DBObject>();
AggregationOutput output = dbCollection.aggregate(pipeline);
List<DBObject> results = (List<DBObject>) output.results();
//iterate this list and cast DBObject to your POJO
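The $redact operator is not really the best option here, and its overly simple logic is the main reason the attempted query does not work. Redaction is pretty much an "all or nothing" process against a single specific condition, and that condition can be used to $$DESCEND and therefore traverse the levels of the document.
At best you get a lot of "false positives" by transposing a value in wherever the field does not exist at a given level. At worst you end up removing a whole document that could otherwise have been a match. $redact has its uses, but this is not really one of them.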
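To make the "all or nothing" behaviour concrete, here is a minimal plain-Java sketch that imitates how a $redact stage walks a document. It is only an illustration: the class RedactSketch and its helpers are hypothetical names, no MongoDB driver is involved, and the rule imitated is "descend when metaTags contains the value, prune otherwise". The real operator has more cases (such as $$KEEP) that are left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified in-memory imitation of $redact; not the server's implementation.
class RedactSketch {

    // Returns the redacted copy of one document level, or null when the level is pruned.
    static Map<String, Object> redact(Map<String, Object> level, String tag) {
        Object tags = level.get("metaTags");
        boolean matched = tags instanceof List && ((List<?>) tags).contains(tag);
        if (!matched) {
            return null; // like $$PRUNE: the level is dropped and its children are never inspected
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : level.entrySet()) {
            Object value = redactValue(field.getValue(), tag);
            if (value != null) {
                out.put(field.getKey(), value);
            }
        }
        return out;
    }

    // Like $$DESCEND: scalars are kept, embedded documents (also inside arrays) are tested again.
    @SuppressWarnings("unchecked")
    static Object redactValue(Object value, String tag) {
        if (value instanceof Map) {
            return redact((Map<String, Object>) value, tag);
        }
        if (value instanceof List) {
            List<Object> kept = new ArrayList<>();
            for (Object item : (List<?>) value) {
                Object redacted = redactValue(item, tag);
                if (redacted != null) {
                    kept.add(redacted);
                }
            }
            return kept;
        }
        return value;
    }

    // Builds a map from alternating keys and values.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put((String) keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    static Map<String, Object> episodeSample() {
        return doc("_id", 1, "metaTags", Arrays.asList("Arya Stark"));
    }

    // show -> season -> episodes, with metaTags only on the innermost level.
    static Map<String, Object> nestedSample() {
        Map<String, Object> season = doc("_id", 1, "episodes", Arrays.asList(episodeSample()));
        Map<String, Object> show = doc("name", "Game of Thrones", "season", Arrays.asList(season));
        return doc("show", Arrays.asList(show));
    }

    public static void main(String[] args) {
        // The episode level on its own passes the test.
        System.out.println(redact(episodeSample(), "Arya Stark"));
        // The top level has no metaTags, so the whole document is pruned: prints null.
        System.out.println(redact(nestedSample(), "Arya Stark"));
    }
}
```

Because the condition is evaluated at every level, the outer levels that carry no "metaTags" field decide the outcome before the matching episode is ever reached.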
First, a simplified sample based on your structure. This is mostly to be able to visualize what we want to filter out of the content:
{
"show": [
{
"name": "Game of Thrones",
"season": [
{
"_id": 1,
"episodes": [
{
"_id": 1,
"metaTags": [
"Arya Stark"
]
},
{
"_id": 2,
"metaTags": [
"John Snow"
]
}
]
},
{
"_id": 2,
"episodes": [
{
"_id": 1,
"metaTags": [
"Arya Stark"
]
}
]
}
]
},
{
"name": "Seinfeld",
"season": [
{
"_id": 1,
"episodes": [
{
"_id": 1,
"metaTags": [
"Jerry Seinfeld"
]
}
]
}
]
}
]
}
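There are two approaches to getting the result here. First, there is the traditional approach of using $unwind to work with the arrays, which are then filtered with $match and conditional expressions, plus of course several $group stages in order to rebuild the arrays: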
db.sample.aggregate([
{ "$match": {
"show.season.episodes.metaTags": "Arya Stark"
}},
{ "$unwind": "$show" },
{ "$unwind": "$show.season" },
{ "$unwind": "$show.season.episodes" },
{ "$unwind": "$show.season.episodes.metaTags" },
{ "$group": {
"_id": {
"_id": "$_id",
"show": {
"name": "$show.name",
"season": {
"_id": "$show.season._id",
"episodes": {
"_id": "$show.season.episodes._id",
}
}
}
},
"metaTags": { "$push": "$show.season.episodes.metaTags" },
"matched": {
"$sum": {
"$cond": [
{ "$eq": [ "$show.season.episodes.metaTags", "Arya Stark" ] },
1,
0
]
}
}
}},
{ "$sort": { "_id._id": 1, "_id.show.season.episodes._id": 1 } },
{ "$group": {
"_id": {
"_id": "$_id._id",
"show": {
"name": "$_id.show.name",
"season": {
"_id": "$_id.show.season._id",
},
}
},
"episodes": {
"$push": {
"$cond": [
{ "$gt": [ "$matched", 0 ] },
{
"_id": "$_id.show.season.episodes._id",
"metaTags": "$metaTags"
},
false
]
}
}
}},
{ "$unwind": "$episodes" },
{ "$match": { "episodes": { "$ne": false } } },
{ "$group": {
"_id": "$_id",
"episodes": { "$push": "$episodes" }
}},
{ "$sort": { "_id._id": 1, "_id.show.season._id": 1 } },
{ "$group": {
"_id": {
"_id": "$_id._id",
"show": {
"name": "$_id.show.name"
}
},
"season": {
"$push": {
"_id": "$_id.show.season._id",
"episodes": "$episodes"
}
}
}},
{ "$group": {
"_id": "$_id._id",
"show": {
"$push": {
"name": "$_id.show.name",
"season": "$season"
}
}
}}
])
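That is all well and good, and fairly easy to understand. However, using $unwind here creates a lot of overhead, especially since we are only talking about filtering within the document itself and not doing any grouping across documents.
There is a modern approach to this, but be warned that while it is efficient, it is an absolute "monster", and it is very easy to get lost in the logic when dealing with embedded arrays: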
db.sample.aggregate([
{ "$match": {
"show.season.episodes.metaTags": "Arya Stark"
}},
{ "$project": {
"show": {
"$setDifference": [
{ "$map": {
"input": "$show",
"as": "show",
"in": {
"$let": {
"vars": {
"season": {
"$setDifference": [
{ "$map": {
"input": "$$show.season",
"as": "season",
"in": {
"$let": {
"vars": {
"episodes": {
"$setDifference": [
{ "$map": {
"input": "$$season.episodes",
"as": "episode",
"in": {
"$cond": [
{ "$setIsSubset": [
// first array must be a subset of the second: the search value must appear in metaTags
["Arya Stark"],
"$$episode.metaTags"
]},
"$$episode",
false
]
}
}},
[false]
]
}
},
"in": {
"$cond": [
{ "$ne": [ "$$episodes", [] ] },
{
"_id": "$$season._id",
"episodes": "$$episodes"
},
false
]
}
}
}
}},
[false]
]
}
},
"in": {
"$cond": [
{ "$ne": ["$$season", [] ] },
{
"name": "$$show.name",
"season": "$$season"
},
false
]
}
}
}
}},
[false]
]
}
}}
])
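There is a lot of array processing in there with $map, and variable declarations with $let for each level, since we are both "filtering" the content via $setDifference and testing for empty arrays.
With a single $project stage after the initial query match, this is much faster than the previous process.
Both produce the same filtered result: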
{
"_id" : ObjectId("55b3455e64518e494632fa16"),
"show" : [
{
"name" : "Game of Thrones",
"season" : [
{
"_id" : 1,
"episodes" : [
{
"_id" : 1,
"metaTags" : [
"Arya Stark"
]
}
]
},
{
"_id" : 2,
"episodes" : [
{
"_id" : 1,
"metaTags" : [
"Arya Stark"
]
}
]
}
]
}
]
}
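All of the "show", "season" and "episodes" arrays are completely filtered of any documents that did not match the inner "metaTags" condition. The "metaTags" array itself is untouched; it is only tested for a match via $setIsSubset, and really only so that the "episodes" array content can be filtered where there is no match.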
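The filtering rule described above can also be sketched on plain Java collections, which may help when reading the pipeline. This is a hedged illustration only: NestedFilterSketch and its methods are hypothetical names, the data is a copy of the simplified sample, and nothing here talks to MongoDB. Each level filters its child array and is dropped when that array ends up empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory illustration only: mirrors the pipeline's filter logic on plain collections.
class NestedFilterSketch {

    // Keep only the episodes whose "metaTags" list contains the tag.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> filterEpisodes(List<Map<String, Object>> episodes, String tag) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> episode : episodes) {
            List<String> tags = (List<String>) episode.get("metaTags");
            if (tags != null && tags.contains(tag)) {
                kept.add(episode); // "metaTags" itself is left untouched
            }
        }
        return kept;
    }

    // Filter each season's episodes and drop seasons that have none left.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> filterSeasons(List<Map<String, Object>> seasons, String tag) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> season : seasons) {
            List<Map<String, Object>> episodes =
                    filterEpisodes((List<Map<String, Object>>) season.get("episodes"), tag);
            if (!episodes.isEmpty()) {
                Map<String, Object> copy = new LinkedHashMap<>(season);
                copy.put("episodes", episodes);
                kept.add(copy);
            }
        }
        return kept;
    }

    // Filter each show's seasons and drop shows that have none left.
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> filterShows(List<Map<String, Object>> shows, String tag) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> show : shows) {
            List<Map<String, Object>> seasons =
                    filterSeasons((List<Map<String, Object>>) show.get("season"), tag);
            if (!seasons.isEmpty()) {
                Map<String, Object> copy = new LinkedHashMap<>(show);
                copy.put("season", seasons);
                kept.add(copy);
            }
        }
        return kept;
    }

    // Builds a map from alternating keys and values.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put((String) keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    // The simplified sample: two shows, three seasons, four episodes.
    static List<Map<String, Object>> sample() {
        Map<String, Object> arya1 = doc("_id", 1, "metaTags", Arrays.asList("Arya Stark"));
        Map<String, Object> john = doc("_id", 2, "metaTags", Arrays.asList("John Snow"));
        Map<String, Object> arya2 = doc("_id", 1, "metaTags", Arrays.asList("Arya Stark"));
        Map<String, Object> jerry = doc("_id", 1, "metaTags", Arrays.asList("Jerry Seinfeld"));
        Map<String, Object> got = doc("name", "Game of Thrones", "season", Arrays.asList(
                doc("_id", 1, "episodes", Arrays.asList(arya1, john)),
                doc("_id", 2, "episodes", Arrays.asList(arya2))));
        Map<String, Object> seinfeld = doc("name", "Seinfeld", "season", Arrays.asList(
                doc("_id", 1, "episodes", Arrays.asList(jerry))));
        return Arrays.asList(got, seinfeld);
    }

    public static void main(String[] args) {
        System.out.println(filterShows(sample(), "Arya Stark"));
    }
}
```

Filtering in application code like this means fetching whole documents first, so it is a way to reason about the expected result rather than a replacement for doing the work on the server.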
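Using this with the Java driver is a fairly straightforward process, since the pipeline is just a data structure made of objects and lists. In the same way, you simply build the same structure in Java using standard lists and BSON Document objects. It is basically all list and map syntax: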
MongoDatabase db = mongoClient.getDatabase("test");
MongoCollection<Document> collection = db.getCollection("sample");
String searchString = "Arya Stark";
List<Document> pipeline = Arrays.<Document>asList(
new Document("$match",
new Document("show.season.episodes.metaTags",searchString)
),
new Document("$project",
new Document("show",
new Document("$setDifference",
Arrays.<Object>asList(
new Document("$map",
new Document("input","$show")
.append("as","show")
.append("in",
new Document("$let",
new Document("vars",
new Document("season",
new Document("$setDifference",
Arrays.<Object>asList(
new Document("$map",
new Document("input","$$show.season")
.append("as","season")
.append("in",
new Document("$let",
new Document("vars",
new Document("episodes",
new Document("$setDifference",
Arrays.<Object>asList(
new Document("$map",
new Document("input","$$season.episodes")
.append("as","episode")
.append("in",
new Document("$cond",
Arrays.<Object>asList(
new Document("$setIsSubset",
Arrays.<Object>asList(
// first array must be a subset of the second: the search value must appear in metaTags
Arrays.<Object>asList(searchString),
"$$episode.metaTags"
)
),
"$$episode",
false
)
)
)
),
Arrays.<Object>asList(false)
)
)
)
)
.append("in",
new Document("$cond",
Arrays.<Object>asList(
new Document("$ne",
Arrays.<Object>asList(
"$$episodes",
Arrays.<Object>asList()
)
),
new Document("_id","$$season._id")
.append("episodes","$$episodes"),
false
)
)
)
)
)
),
Arrays.<Object>asList(false)
)
)
)
)
.append("in",
new Document("$cond",
Arrays.<Object>asList(
new Document("$ne",
Arrays.<Object>asList(
"$$season",
Arrays.<Object>asList()
)
),
new Document("name","$$show.name")
.append("season","$$season"),
false
)
)
)
)
)
),
Arrays.<Object>asList(false)
)
)
)
)
);
System.out.println(JSON.serialize(pipeline));
AggregateIterable<Document> result = collection.aggregate(pipeline);
MongoCursor<Document> cursor = result.iterator();
while (cursor.hasNext()) {
Document doc = cursor.next();
System.out.println(doc.toJson());
}
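Since the three levels repeat the same $setDifference / $map / $let / $cond pattern, the nesting can be tamed with a few small helper methods. The following is a sketch under stated assumptions rather than driver code: PipelineSketch and all of its helpers are hypothetical names, and the stages are built as plain java.util maps and lists so the example runs without the driver. The $setIsSubset test here checks that the search value is contained in "metaTags".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class: builds the pipeline as plain maps and lists.
class PipelineSketch {

    // Builds a map from alternating keys and values.
    static Map<String, Object> doc(Object... keyValues) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            map.put((String) keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    // { $setDifference: [ { $map: { input, as, in } }, [false] ] }
    static Map<String, Object> filterArray(String input, String as, Object in) {
        Map<String, Object> map = doc("$map", doc("input", input, "as", as, "in", in));
        return doc("$setDifference", Arrays.<Object>asList(map, Arrays.<Object>asList(false)));
    }

    // { $cond: [ test, value, false ] }
    static Map<String, Object> keepIf(Object test, Object value) {
        return doc("$cond", Arrays.<Object>asList(test, value, false));
    }

    // { $ne: [ variable, [] ] }
    static Map<String, Object> notEmpty(String variable) {
        return doc("$ne", Arrays.<Object>asList(variable, Arrays.<Object>asList()));
    }

    // { $let: { vars: { name: value }, in: in } }
    static Map<String, Object> let(String name, Object value, Object in) {
        return doc("$let", doc("vars", doc(name, value), "in", in));
    }

    static List<Map<String, Object>> buildPipeline(String tag) {
        // True when the search value is contained in the episode's metaTags.
        Object hasTag = doc("$setIsSubset",
                Arrays.<Object>asList(Arrays.<Object>asList(tag), "$$episode.metaTags"));
        Object episodes = filterArray("$$season.episodes", "episode",
                keepIf(hasTag, "$$episode"));
        Object seasons = filterArray("$$show.season", "season",
                let("episodes", episodes, keepIf(notEmpty("$$episodes"),
                        doc("_id", "$$season._id", "episodes", "$$episodes"))));
        Object shows = filterArray("$show", "show",
                let("season", seasons, keepIf(notEmpty("$$season"),
                        doc("name", "$$show.name", "season", "$$season"))));
        return Arrays.asList(
                doc("$match", doc("show.season.episodes.metaTags", tag)),
                doc("$project", doc("show", shows)));
    }

    public static void main(String[] args) {
        System.out.println(buildPipeline("Arya Stark"));
    }
}
```

To run this against a collection, each stage map would presumably be wrapped with the Document constructor that accepts a Map before being passed to aggregate(); whether nested plain maps and lists are encoded as expected should be checked against the driver version in use.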