How to get values from the aggregation buckets of an elasticsearch aggregation query result in Java
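So, I've been able to replicate the elasticsearch query I need in Java using the elasticsearch high-level REST client. The problem is that I can't retrieve the values I want. Before I give the code, I want to state the overarching goal, in case there is a simpler solution (it seems like this shouldn't be so difficult).
Overall goal: for each unique value in the "recommender" field, get the number of documents where "visited" == true.
My current state: I've been able to write a query in kibana/elasticsearch that has the desired output, but when I replicate this query in Java, I can't access the data I need. (Verified via searchRequest.source().toString().)
Here is the query: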
{
  "aggs": {
    "recommenderIDs": {
      "terms": {
        "field": "recommender"
      },
      "aggs": {
        "visit_stats": {
          "filters": {
            "filters": {
              "visited": {
                "match": {
                  "visited": true
                }
              }
            }
          }
        }
      }
    }
  }
}
This is what I have in my Java code:
// ...
SearchRequest searchRequest = new SearchRequest(INDEX_REC_RECOMMENDATIONS);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
String aggregationName = "recommenderId";
String filterName = "wasVisited";
TermsAggregationBuilder aggQuery = AggregationBuilders
        .terms(aggregationName)
        .field(RecommendationRepoFieldNames.RECOMMENDER);
AggregationBuilder aggFilters = AggregationBuilders.filters(
        filterName,
        new FiltersAggregator.KeyedFilter(
                RecommendationRepoFieldNames.RECOMMENDER,
                QueryBuilders.termQuery(RecommendationRepoFieldNames.VISITED, true))
);
aggQuery.subAggregation(aggFilters);
searchSourceBuilder.aggregation(aggQuery);
searchRequest.source(searchSourceBuilder);
// System.out.println(searchRequest.source().toString());
try {
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    Aggregations aggregations = searchResponse.getAggregations();
    Terms byRecommenderId = aggregations.get(aggregationName);
    Filters filterResponses = searchResponse.getAggregations().get(aggregationName);
    // for (Filters.Bucket entry : filterResponses.getBuckets()) {
    //     String key = entry.getKeyAsString();
    // }
    for (Terms.Bucket bucket : byRecommenderId.getBuckets()) {
        String bucketKey = bucket.getKeyAsString();
        long totalDocs = bucket.getDocCount();
        Aggregation visitedDocs = bucket.getAggregations().get(filterName);
        //long visitedDocsCount = visitedDocs.getValue();
        System.out.println();
    }
} catch (IOException e) { //...
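I've been fiddling with this all day without making any progress. It's especially frustrating that when I debug in my IDE I can see the document count for each recommender bucket, but I can't figure out how to access it. I realize there are about 180 classes that extend Aggregation; I tried a few of them, but failed every time.
Also, if you know of any good resources for the elasticsearch Java high-level REST client, please let me know. Thanks, everyone!
--------- Edit 5/4/21 -------------
Example output from elasticsearch: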
// searchResponse (documents returned have been truncated to show only the part we are interested in)
"aggregations": {
  "sterms#recommenderId": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "AdjacentActivityRecommender",
        "doc_count": 3,
        "filters#wasVisited": {
          "buckets": {
            "recommender": {
              "doc_count": 2
            }
          }
        }
      },
      {
        "key": "DefaultProfileDBRecommender",
        "doc_count": 2,
        "filters#wasVisited": {
          "buckets": {
            "recommender": {
              "doc_count": 2
            }
          }
        }
      },
      {
        "key": "PSTR_SC_DI",
        "doc_count": 2,
        "filters#wasVisited": {
          "buckets": {
            "recommender": {
              "doc_count": 1
            }
          }
        }
      },
      {
        "key": "SignificantCategories",
        "doc_count": 2,
        "filters#wasVisited": {
          "buckets": {
            "recommender": {
              "doc_count": 2
            }
          }
        }
      }
    ]
  }
}
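searchResponse.getAggregations() is then saved to aggregations. From there we can loop over the bucket for each recommenderID, but I could never get into the aggregations inside each bucket, which is exactly what I need to do. Solution code posted below: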
try {
    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    Aggregations aggregations = searchResponse.getAggregations();
    Terms byRecommenderId = aggregations.get(aggregationName);
    for (Terms.Bucket bucket : byRecommenderId.getBuckets()) {
        String recommenderId = bucket.getKeyAsString();
        double totalDocs = bucket.getDocCount();
        // next two lines are the solution:
        Aggregations subAggregations = bucket.getAggregations();
        Filters byWasVisited = subAggregations.get(filterName);
        // always only one item from getBuckets()
        double totalVisited = byWasVisited.getBuckets().get(0).getDocCount();
        double percentVisited = totalVisited / totalDocs;
        recommenderViews.put(recommenderId, percentVisited);
    }
    // ...
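The issue was that I needed to extract the next inner level of aggregations (the sub-aggregations), which is done by calling getAggregations() again, this time inside the loop. At that point, we just call get(filterName) on the sub-aggregations.
Could you post an example of how elasticsearch returns the results?
Added a sample elastic response. The doc count of each bucket is what is actually needed in this result set.
From my experience, I would try casting your aggregation result to "MultiBucketsAggregation". For example: ((MultiBucketsAggregation) aggregations.get("sterms#recommenderId")). You can then use the getBuckets method to loop over each bucket. Inside each bucket, you should be able to call getAggregations(), which will give you the data you need. That's my best guess.
Thanks, @VitorSantos! As it turns out, I figured this out a few days ago, and it did work. The solution was right under my nose the whole time (argh!!). I need my work machine; I'll post the solution here for posterity on Monday.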
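For anyone who wants to check the traversal logic without a running cluster, here is a rough plain-Java sketch of the same two-level walk: loop over the terms buckets, then look up the filters sub-aggregation inside each bucket and read its single bucket's doc count. Everything in it is an assumption made for illustration: `VisitedRatioSketch`, `Bucket`, `withSubAggregation`, `sampleTermsBuckets` and `visitedRatios` are hypothetical stand-ins, not Elasticsearch client types, and the data is simply the sample response above re-typed by hand.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class VisitedRatioSketch {

    /** Hypothetical stand-in for a bucket: a key, a doc count, and named sub-aggregations. */
    static final class Bucket {
        final String key;
        final long docCount;
        // sub-aggregation name -> the buckets of that sub-aggregation
        final Map<String, List<Bucket>> subAggregations = new LinkedHashMap<>();

        Bucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }

        Bucket withSubAggregation(String aggName, Bucket... buckets) {
            List<Bucket> list = new ArrayList<>();
            for (Bucket b : buckets) {
                list.add(b);
            }
            subAggregations.put(aggName, list);
            return this;
        }
    }

    /** Rebuilds the sample response from the question as plain objects. */
    static List<Bucket> sampleTermsBuckets() {
        List<Bucket> buckets = new ArrayList<>();
        buckets.add(new Bucket("AdjacentActivityRecommender", 3)
                .withSubAggregation("wasVisited", new Bucket("recommender", 2)));
        buckets.add(new Bucket("DefaultProfileDBRecommender", 2)
                .withSubAggregation("wasVisited", new Bucket("recommender", 2)));
        buckets.add(new Bucket("PSTR_SC_DI", 2)
                .withSubAggregation("wasVisited", new Bucket("recommender", 1)));
        buckets.add(new Bucket("SignificantCategories", 2)
                .withSubAggregation("wasVisited", new Bucket("recommender", 2)));
        return buckets;
    }

    /** Outer loop over the terms buckets, inner lookup of the filters sub-aggregation. */
    static Map<String, Double> visitedRatios(List<Bucket> termsBuckets, String filterName) {
        Map<String, Double> ratios = new LinkedHashMap<>();
        for (Bucket bucket : termsBuckets) {
            double totalDocs = bucket.docCount;
            // Same step as bucket.getAggregations().get(filterName) in the solution code.
            List<Bucket> filterBuckets = bucket.subAggregations.get(filterName);
            // Only one keyed filter was defined, so there is exactly one bucket here.
            double totalVisited = filterBuckets.get(0).docCount;
            // Guard against an empty bucket to avoid dividing by zero.
            ratios.put(bucket.key, totalDocs == 0 ? 0.0 : totalVisited / totalDocs);
        }
        return ratios;
    }

    public static void main(String[] args) {
        visitedRatios(sampleTermsBuckets(), "wasVisited")
                .forEach((key, ratio) -> System.out.println(key + " -> " + ratio));
    }
}
```

Note that the value stored per recommender is a fraction between 0 and 1 (visited docs divided by total docs), the same quantity the solution code calls percentVisited.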