Elasticsearch: complex search query
I have the following documents in my Elasticsearch index:
[{
"_index": "ten2",
"_type": "documents",
"_id": "c323c2244a4a4c22_en-us",
"_source": {
"publish_details": [{
"environment": "603fe91adbdcff66",
"time": "2020-06-24T13:36:55.514Z",
"locale": "hi-in",
"user": "aadab2f531206e9d",
"version": 1
},
{
"environment": "603fe91adbdcff66",
"time": "2020-06-24T13:36:55.514Z",
"locale": "en-us",
"user": "aadab2f531206e9d",
"version": 1
}
],
"created_at": "2020-06-24T13:36:43.037Z",
"_in_progress": false,
"title": "Entry 1",
"locale": "en-us",
"url": "/entry-1",
"tags": [],
"uid": "c323c2244a4a4c22",
"updated_at": "2020-06-24T13:36:43.037Z",
"fields": []
}
},
{
"_index": "ten2",
"_type": "documents",
"_id": "c323c2244a4a4c22_mr-in",
"_source": {
"publish_details": [{
"environment": "603fe91adbdcff66",
"time": "2020-06-24T13:37:26.205Z",
"locale": "mr-in",
"user": "aadab2f531206e9d",
"version": 1
}],
"created_at": "2020-06-24T13:36:43.037Z",
"_in_progress": false,
"title": "Entry 1 marathi",
"locale": "mr-in",
"url": "/entry-1",
"tags": [],
"uid": "c323c2244a4a4c22",
"updated_at": "2020-06-24T13:37:20.092Z",
"fields": []
}
}
]
I expect the result to be empty. Note that both documents have the same uid. I am using the following query to fetch the results:
{
  "query": {
    "bool": {
      "must": [{
        "bool": {
          "must_not": [{
            "bool": {
              "must": [{
                "nested": {
                  "path": "publish_details",
                  "query": {
                    "term": {
                      "publish_details.environment": "603fe91adbdcff66"
                    }
                  }
                }
              }, {
                "nested": {
                  "path": "publish_details",
                  "query": {
                    "term": {
                      "publish_details.locale": "en-us"
                    }
                  }
                }
              }, {
                "nested": {
                  "path": "publish_details",
                  "query": {
                    "term": {
                      "publish_details.locale": "hi-in"
                    }
                  }
                }
              }, {
                "nested": {
                  "path": "publish_details",
                  "query": {
                    "term": {
                      "publish_details.locale": "mr-in"
                    }
                  }
                }
              }]
            }
          }]
        }
      }]
    }
  }
}
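However, the query above returns both documents, whereas I want the result to be blank. The reason is that the uid is common to both documents, and between them that uid has publish details for all three locales. Is there a way to get the correct result, perhaps with an aggregation query? This is only a sample; I have many documents to filter. Kindly help me here.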
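A rough way to see why both documents come back: a query is evaluated against each document on its own, and neither document by itself carries all three locales. The sketch below mimics that in plain Python on trimmed copies of the two sample documents, then repeats the same check after grouping by uid. The function names and the `docs` list are illustrative only; nothing here talks to Elasticsearch.

```python
from collections import defaultdict

ENVIRONMENT = "603fe91adbdcff66"
REQUIRED_LOCALES = {"en-us", "hi-in", "mr-in"}

# Trimmed copies of the two documents from the question.
docs = [
    {"_id": "c323c2244a4a4c22_en-us", "uid": "c323c2244a4a4c22",
     "publish_details": [
         {"environment": "603fe91adbdcff66", "locale": "hi-in"},
         {"environment": "603fe91adbdcff66", "locale": "en-us"},
     ]},
    {"_id": "c323c2244a4a4c22_mr-in", "uid": "c323c2244a4a4c22",
     "publish_details": [
         {"environment": "603fe91adbdcff66", "locale": "mr-in"},
     ]},
]


def covers_everything(details):
    """True if the entries contain the environment and every required locale."""
    has_environment = any(d["environment"] == ENVIRONMENT for d in details)
    locales = {d["locale"] for d in details}
    return has_environment and REQUIRED_LOCALES <= locales


def per_document_hits(documents):
    """Mimic must_not: a document is dropped only if it alone covers everything."""
    return [d["_id"] for d in documents if not covers_everything(d["publish_details"])]


def per_uid_hits(documents):
    """Same check, but applied to the combined publish_details of each uid."""
    grouped = defaultdict(list)
    for d in documents:
        grouped[d["uid"]].extend(d["publish_details"])
    complete = {uid for uid, details in grouped.items() if covers_everything(details)}
    return [d["_id"] for d in documents if d["uid"] not in complete]


print(per_document_hits(docs))  # ['c323c2244a4a4c22_en-us', 'c323c2244a4a4c22_mr-in']
print(per_uid_hits(docs))       # []
```

The second result is the one the question is after, which suggests the condition has to be evaluated per uid rather than per document.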
{
  "aggs": {
    "agg1": {
      "terms": {
        "field": "uid.raw"
      },
      "aggs": {
        "agg2": {
          "nested": {
            "path": "publish_details"
          },
          "aggs": {
            "locales": {
              "terms": {
                "field": "publish_details.locale"
              }
            }
          }
        }
      }
    }
  }
}
This query first groups the documents by uid and then by publish_details.locale.
It gives the following result:
"aggregations": {
"agg1": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "c323c2244a4a4c22",
"doc_count": 2,
"agg2": {
"doc_count": 3,
"locales": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "en-us",
"doc_count": 1
},
{
"key": "hi-in",
"doc_count": 1
},
{
"key": "mr-in",
"doc_count": 1
}
]
}
}
},
{
"key": "c323c2244rrffa4a4c22",
"doc_count": 1,
"agg2": {
"doc_count": 2,
"locales": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "en-us",
"doc_count": 1
},
{
"key": "hi-in",
"doc_count": 1
}
]
}
}
}
]
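I have three documents; two of them share the same uid and the third has a different one.
I will update the query further to drop the first result, the one with 3 buckets. You can also process it further in your code.
This is doable. It is fine for 100,000 documents, but once you have millions you should make sure you have enough resources to run it.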
{
  "size": 0,
  "query": {
    "bool": {
      "must_not": {
        "match": {
          "publish_details.environment": "603fe91adbdcff66"
        }
      }
    }
  },
  "aggs": {
    "uids": {
      "terms": {
        "field": "uid.raw"
      },
      "aggs": {
        "details": {
          "nested": {
            "path": "publish_details"
          },
          "aggs": {
            "locales": {
              "terms": {
                "field": "publish_details.locale"
              }
            },
            "unique_locales": {
              "value_count": {
                "field": "publish_details.locale"
              }
            }
          }
        }
      }
    }
  }
}
Result:
"aggregations": {
"uids": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "c323c2244a4a4c22",
"doc_count": 2,
"details": {
"doc_count": 3,
"locales": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "en-us",
"doc_count": 1
},
{
"key": "hi-in",
"doc_count": 1
},
{
"key": "mr-in",
"doc_count": 1
}
]
},
"unique_locales": {
"value": 3
}
}
},
{
"key": "c323c2244rrffa4a4c22",
"doc_count": 1,
"details": {
"doc_count": 2,
"locales": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "en-us",
"doc_count": 1
},
{
"key": "hi-in",
"doc_count": 1
}
]
},
"unique_locales": {
"value": 2
}
}
}
]
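The "process it further in your code" step could look roughly like the sketch below. It takes the `aggregations` part of the response above and keeps only the uids whose locale buckets do not cover all three locales. The function name `incomplete_uids` and the trimmed `sample` dictionary are illustrative; the aggregation names `uids`, `details` and `locales` are the ones used in the query. It compares bucket keys rather than `unique_locales`, on the assumption that `value_count` counts values, not distinct values, so it could overcount if a locale appears more than once.

```python
REQUIRED_LOCALES = {"en-us", "hi-in", "mr-in"}


def incomplete_uids(aggregations, required=REQUIRED_LOCALES):
    """Return uids whose combined locale buckets miss at least one required locale."""
    kept = []
    for bucket in aggregations["uids"]["buckets"]:
        locales = {b["key"] for b in bucket["details"]["locales"]["buckets"]}
        if not required <= locales:
            kept.append(bucket["key"])
    return kept


# Trimmed copy of the response shown above (only the fields the function reads).
sample = {
    "uids": {
        "buckets": [
            {"key": "c323c2244a4a4c22",
             "details": {"locales": {"buckets": [
                 {"key": "en-us", "doc_count": 1},
                 {"key": "hi-in", "doc_count": 1},
                 {"key": "mr-in", "doc_count": 1},
             ]}}},
            {"key": "c323c2244rrffa4a4c22",
             "details": {"locales": {"buckets": [
                 {"key": "en-us", "doc_count": 1},
                 {"key": "hi-in", "doc_count": 1},
             ]}}},
        ]
    }
}

print(incomplete_uids(sample))  # ['c323c2244rrffa4a4c22']
```

Keep in mind that a terms aggregation returns only the top buckets by default (10), so with many distinct uids the request would need a larger `size` or some form of paging before this kind of client-side filtering sees every uid.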
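Please help me fix the query above so that it returns the correct result.

Your question is unclear.

@Gibbs I have shared my Elasticsearch documents and my query. I want an empty result, but my query returns all documents. So I want a query that gives me a blank result, based on publish_details.locale and publish_details.environment.

You mean the same uid, checked against all 3 locales!?

Yes, the uid field has the same value in both documents.

Thanks for your reply. But I have more than 100,000 documents that need to be filtered. Can I also use publish_details.environment here? That is, I need to ignore documents that share a uid and whose publish_details.environment is 603fe91adbdcff66 and whose publish_details.locale values are hi-in, en-us and mr-in.

Is an aggregation required here, or is there a direct way to get the result, like the query I was using? I ask because when a query is combined with an aggregation, both the hits and the aggregations come back. I need the full details of the other documents.

It won't return hits because I set size to 0. I don't think there is a way to do this without an aggregation, because the condition spans multiple documents.

Thanks. It doesn't return hits, but what about the other fields in the documents? Say I have 3 documents and want only one of them in the response, together with its other fields such as title and url. How can we get those fields through an aggregation?

You are asking many questions in one post. If the current question is resolved, you can accept the answer and open another question so that anyone can help you. Are title and url unique per document, the same across documents, or can they be anything? I suspect you need hits, but you would have to aggregate on those fields.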
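One possible direction for the last two points, sketched here and not tested against a cluster: a `bucket_selector` pipeline aggregation can drop the uid buckets that already have all three locales, and a `top_hits` sub-aggregation can return fields such as title and url for the documents in the remaining buckets. Both are standard Elasticsearch aggregations, but whether they fit this mapping is an assumption. The function name `build_request`, the aggregation names `incomplete_only` and `docs`, and the choice of `_source` fields are all hypothetical. The sketch reuses the `unique_locales` value count from the answer, which counts values and not distinct values.

```python
import json


def build_request(required_locale_count=3, source_fields=("title", "url", "locale")):
    """Build a search body that keeps only uids with fewer locales than required."""
    return {
        "size": 0,
        "aggs": {
            "uids": {
                "terms": {"field": "uid.raw"},
                "aggs": {
                    "details": {
                        "nested": {"path": "publish_details"},
                        "aggs": {
                            "unique_locales": {
                                "value_count": {"field": "publish_details.locale"}
                            }
                        },
                    },
                    # Pipeline aggregation: drop buckets that reach the required count.
                    "incomplete_only": {
                        "bucket_selector": {
                            "buckets_path": {"localeCount": "details>unique_locales"},
                            "script": f"params.localeCount < {int(required_locale_count)}",
                        }
                    },
                    # Return selected fields of the documents in each surviving bucket.
                    "docs": {
                        "top_hits": {
                            "size": 10,
                            "_source": {"includes": list(source_fields)},
                        }
                    },
                },
            }
        },
    }


print(json.dumps(build_request(), indent=2))
```

The printed JSON can be sent as the body of a `_search` request. If a locale can appear more than once per uid, swapping `value_count` for a `cardinality` aggregation is probably the safer choice.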