Elasticsearch post_filter aggregation query
I am interested in all APIs that did not return even a single 200 response (within a given time interval). Basically, I need this:
select url from api_log
except/minus
select url from api_log where status='200'
Translated to ES, I am trying a similar approach: first compute the aggregation, then, from the resulting buckets, filter out every record that has a child with status 200.
ES sample data
{
"_index": "api_log",
"_type": "_doc",
"_id": "1",
"_version": 1,
"_score": 1,
"_source": {
"in_time": "2019-05-13T17:20:51.108945",
"out_time": "2019-05-13T17:20:51.145549",
"duration": 36.6041660308838,
"status": "200",
"url": "/api/myFirstAPI"
}
},
{
"_index": "api_log",
"_type": "_doc",
"_id": "2",
"_version": 1,
"_score": 1,
"_source": {
"in_time": "2019-05-13T17:20:57.915694",
"out_time": "2019-05-13T17:20:57.941989",
"duration": 26.2949466705322,
"status": "403",
"url": "/api/mySecondAPI"
}
},
{
"_index": "api_log",
"_type": "_doc",
"_id": "3",
"_version": 1,
"_score": 1,
"_source": {
"in_time": "2019-05-13T17:22:35.274372",
"out_time": "2019-05-13T17:22:35.288944",
"duration": 14.5719051361084,
"status": "400",
"url": "/api/myFirstAPI"
}
}
For the data above, I expect the resulting set of URLs to be {'/api/mySecondAPI'}
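The expected result is the set difference written in SQL at the top of the question. As a minimal sketch in plain Python (no Elasticsearch involved; the function name `urls_without_200` is made up for illustration), applied to the three sample documents:

```python
# The three sample documents, reduced to the two fields that matter here.
docs = [
    {"url": "/api/myFirstAPI", "status": "200"},
    {"url": "/api/mySecondAPI", "status": "403"},
    {"url": "/api/myFirstAPI", "status": "400"},
]


def urls_without_200(docs):
    """Return the URLs that never appear with status '200' (SQL EXCEPT semantics)."""
    all_urls = {d["url"] for d in docs}
    ok_urls = {d["url"] for d in docs if d["status"] == "200"}
    return all_urls - ok_urls


print(urls_without_200(docs))  # {'/api/mySecondAPI'}
```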
Request/response using only aggs
POST /api_log/_search
{
"size": 0,
"aggs": {
"url": {
"terms": {
"field": "url.keyword"
},
"aggregations": {
"status": {
"terms": {
"field": "status.keyword"
}
}
}
}
}
}
Response to the above request
{
"took" : 880,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"url" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 394668,
"buckets" : [
{
"key" : "/api/myFirstRequest",
"doc_count" : 1352845,
"status" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "200",
"doc_count" : 1187611
},
{
"key" : "302",
"doc_count" : 139932
},
{
"key" : "401",
"doc_count" : 22615
},
{
"key" : "500",
"doc_count" : 2250
},
{
"key" : "403",
"doc_count" : 437
}
]
}
},
...
...
...
From the above, I need to keep only the buckets (URLs) that have no sub-bucket with status "200".
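If a server-side filter is not essential, the same selection could be made on the client over the aggregation response. A minimal sketch, assuming the response has the shape shown above; the helper name `urls_never_ok` and the second bucket in the sample dict are hypothetical, added only for illustration:

```python
def urls_never_ok(response, ok_status="200"):
    """Return the keys of URL buckets that have no sub-bucket for ok_status."""
    url_buckets = response["aggregations"]["url"]["buckets"]
    return [
        bucket["key"]
        for bucket in url_buckets
        if all(sub["key"] != ok_status for sub in bucket["status"]["buckets"])
    ]


# Trimmed response: the first bucket comes from the output above,
# the second one is a hypothetical bucket added for illustration.
response = {
    "aggregations": {
        "url": {
            "buckets": [
                {
                    "key": "/api/myFirstRequest",
                    "doc_count": 1352845,
                    "status": {
                        "buckets": [
                            {"key": "200", "doc_count": 1187611},
                            {"key": "302", "doc_count": 139932},
                        ]
                    },
                },
                {
                    "key": "/api/hypotheticalRequest",
                    "doc_count": 12,
                    "status": {"buckets": [{"key": "500", "doc_count": 12}]},
                },
            ]
        }
    }
}

print(urls_never_ok(response))  # ['/api/hypotheticalRequest']
```

Note that a terms aggregation returns only the top 10 buckets by default, so both `size` values would probably need to be raised for this to see every URL and every status.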
This is as far as I have got. It looks close, yet so far. I cannot figure out what should go in the type field.
Request with filter
POST /api_log/_search
{
"size": 0,
"aggs": {
"page_name": {
"terms": {
"field": "url.keyword"
},
"aggregations": {
"status": {
"terms": {
"field": "status.keyword"
}
}
}
}
},
"post_filter": {
"bool": {
"must_not": [
{
"has_child" : {
"type" : "?????",
"query" : {
"term" : {"status" : "200"}
}
}
}
]
}
}
}
Sample input (from Apache logs):
t1 /api/FirstAPI 200 <-- Eliminate First API completely
t2 /api/FirstAPI 400
t3 /api/FirstAPI 403
t4 /api/SecondAPI 403
t5 /api/SecondAPI 400
t6 /api/ThirdAPI 500
t7 /api/ThirdAPI 500
t8 /api/SecondAPI 200 <---Eliminate Second API completely
t9 /api/ThirdAPI 500
t10 /api/ThirdAPI 403
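If I understand correctly, you only want to exclude 200 from the aggregation. I don't see a reason to use post_filter here. You can use the exclude parameter of the terms aggregation. All 200 responses are still counted in the doc_count field of the URL bucket, but the 200 bucket is excluded from the aggregation response and is not shown.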
POST /api_log/_search
{
"size": 0,
"aggs": {
"url": {
"terms": {
"field": "url.keyword"
},
"aggregations": {
"status": {
"terms": {
"field": "status.keyword",
"exclude": "200"
}
}
}
}
}
}
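To see what this request would return for the sample input in the question, here is a rough plain-Python simulation of a terms aggregation whose sub-aggregation excludes one value. It only illustrates the semantics and is not Elasticsearch code; `terms_with_exclude` is a hypothetical helper.

```python
from collections import Counter

# Sample input from the question as (url, status) pairs, t1 to t10.
log = [
    ("/api/FirstAPI", "200"), ("/api/FirstAPI", "400"), ("/api/FirstAPI", "403"),
    ("/api/SecondAPI", "403"), ("/api/SecondAPI", "400"),
    ("/api/ThirdAPI", "500"), ("/api/ThirdAPI", "500"),
    ("/api/SecondAPI", "200"),
    ("/api/ThirdAPI", "500"), ("/api/ThirdAPI", "403"),
]


def terms_with_exclude(rows, exclude):
    """Group by url; count every row in doc_count but hide the excluded status bucket."""
    buckets = {}
    for url, status in rows:
        bucket = buckets.setdefault(url, {"doc_count": 0, "status": Counter()})
        bucket["doc_count"] += 1          # 200s still count toward the URL bucket
        if status != exclude:
            bucket["status"][status] += 1  # but get no status sub-bucket
    return buckets


for url, bucket in terms_with_exclude(log, "200").items():
    print(url, bucket["doc_count"], dict(bucket["status"]))
```

Every URL still gets a bucket here, including the ones that returned 200 at some point; only the "200" sub-bucket disappears.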
Alternative:
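Based on your input, it seems you want the 200 responses to be part of the result set (since you are using post_filter), but if that is not the case, here is another approach. Aggregations are computed over the query results, so if you use must_not to exclude 200 from the result set, there will be no bucket with status 200.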
POST /api_log/_search
{
"size": 0,
"query": {
"bool": {
"must_not": [
{
"terms": {
"status": [
"200"
]
}
}
]
}
},
"aggs": {
"url": {
"terms": {
"field": "url.keyword"
},
"aggregations": {
"status": {
"terms": {
"field": "status.keyword"
}
}
}
}
}
}
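Both requests above still return a bucket for any URL that has at least one non-200 response. If the goal is to drop a URL entirely once it has returned a 200, one option that might fit is a `filter` sub-aggregation combined with a `bucket_selector` pipeline aggregation. The sketch below builds such a request body in Python. The aggregation names `ok` and `never_ok`, the helper `build_never_ok_request`, and the `max_urls` value are made up for illustration; the field names are taken from the requests above, and the request has not been run against the asker's index.

```python
import json


def build_never_ok_request(url_field="url.keyword",
                           status_field="status.keyword",
                           ok_status="200",
                           max_urls=1000):
    """Build a search body that keeps only URL buckets with zero ok_status documents."""
    return {
        "size": 0,
        "aggs": {
            "url": {
                # max_urls is an assumed cap; terms returns 10 buckets by default.
                "terms": {"field": url_field, "size": max_urls},
                "aggs": {
                    # Counts the documents of this URL whose status is ok_status.
                    "ok": {"filter": {"term": {status_field: ok_status}}},
                    # Drops the URL bucket unless that count is zero.
                    "never_ok": {
                        "bucket_selector": {
                            "buckets_path": {"okCount": "ok._count"},
                            "script": "params.okCount == 0",
                        }
                    },
                },
            }
        },
    }


body = build_never_ok_request()
print("POST /api_log/_search")
print(json.dumps(body, indent=2))
```

The selector is applied to the buckets that the terms aggregation returns, so with many distinct URLs the `size` cap would need to be large enough to cover them. A time-interval restriction could go in a top-level `query`, which is left out here.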
Can you add some sample api_log data and the mapping/schema?
Added more information as you requested. @TheUknown
This is not what I am looking for. Please see my question, which I have edited again. I added sample data to make it clearer. @TheUknown