Elasticsearch: return only a specific part of a document

I have a JSON document that mimics the following structure:

{
"mydata": [
      {
        "Key1": "Hello",
        "Key2": "this",
        "Key3": "is",
        "Key4": "line one",
        "Key5": "of the file"
      },
      {
        "Key1": "Hello",
        "Key2": "this",
        "Key3": "is",
        "Key4": "line two",
        "Key5": "of the file"
      }]
}

The index I am using has no specific mapping of its own. I can write a Lucene-style free-text query such as

mydata.Key4:"line one"

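and it returns the whole document. In my case, however, I only want to retrieve the first part of the JSON object as the result. Is there a way to do this?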

{
        "Key1": "Hello",
        "Key2": "this",
        "Key3": "is",
        "Key4": "line one",
        "Key5": "of the file"
}

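I found that I can use `_source_includes` to retrieve specific fields by passing the required keys. However, I could not find an equivalent that returns all the keys from the specific part of the JSON document that matched the query. Is this because of the way the document is indexed? Can someone point me in the right direction?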

Edit:

I deleted the index and updated the mapping as follows:

{
  "mappings": {
    "properties": {
      "data": {
        "type": "nested"
      }
    }
  }
}

I reindexed the document, skimmed the ES documentation, and ran the following nested query:

{
  "_source": false,
  "query": {
    "nested": {
      "path": "data",
      "query": {
        "match": {
          "data.Key4": "line one"
        }
      },
      "inner_hits": {}
    }
  }
}

However, this also returns all the documents in my index, except that the results now appear under `inner_hits`:

{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 1,
            "relation": "eq"
        },
        "max_score": 0.52889514,
        "hits": [{
            "_index": "myindex",
            "_type": "_doc",
            "_id": "QAZJ-nMBi6fwNevjDQJy",
            "_score": 0.52889514,
            "inner_hits": {
                "data": {
                    "hits": {
                        "total": {
                            "value": 2,
                            "relation": "eq"
                        },
                        "max_score": 0.87546873,
                        "hits": [{
                            "_index": "myindex",
                            "_type": "_doc",
                            "_id": "QAZJ-nMBi6fwNevjDQJy",
                            "_nested": {
                                "field": "data",
                                "offset": 0
                            },
                            "_score": 0.87546873,
                            "_source": {
                                "Key1": "Hello",
                                "Key2": "this",
                                "Key3": "is",
                                "Key4": "line one",
                                "Key5": "of the file"
                            }
                        }, {
                            "_index": "myindex",
                            "_type": "_doc",
                            "_id": "QAZJ-nMBi6fwNevjDQJy",
                            "_nested": {
                                "field": "data",
                                "offset": 1
                            },
                            "_score": 0.18232156,
                            "_source": {
                                "Key1": "Hello",
                                "Key2": "this",
                                "Key3": "is",
                                "Key4": "line two",
                                "Key5": "of the file"
                            }
                        }]
                    }
                }
            }
        }]
    }
}

Am I missing something here?

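You have not defined a `nested` mapping, and that is the main problem. When you save data the way you describe, `mydata` is dynamically mapped as a plain `object` field (with `text` sub-fields), so the array elements are flattened into the parent document instead of being indexed separately. A search therefore returns the whole document. If you define a `nested` mapping for `mydata`, you can use `inner_hits` to retrieve only the matching nested documents.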
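To make the flattening point concrete, here is a minimal Python sketch of what a non-nested array of objects roughly looks like once indexed. It is a simplified model, not Elasticsearch code, and `flatten_object_array` is a hypothetical helper name used only for illustration.

```python
from collections import defaultdict


def flatten_object_array(field, items):
    """Model a non-nested array of objects: one multi-valued field per key.

    This is a simplification for illustration, not Elasticsearch's implementation.
    """
    flat = defaultdict(list)
    for item in items:
        for key, value in item.items():
            flat[f"{field}.{key}"].append(value)
    return dict(flat)


# Trimmed version of the document from the question.
mydata = [
    {"Key1": "Hello", "Key4": "line one"},
    {"Key1": "Hello", "Key4": "line two"},
]

flat = flatten_object_array("mydata", mydata)
print(flat)
```

In this model the link between one object's `Key1` and `Key4` is lost, so a hit can only be reported for the document as a whole. A `nested` field avoids this by indexing each array element as its own hidden document.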
Edit:
Query to use:
{
  "_source": false,
  "query": {
    "nested": {
      "path": "data",
      "inner_hits": {        
      },
      "query": {
        "bool": {
          "must": [
            {
              "term": { //To look for exact match
                "data.Key4.keyword": "line one" //need to match line one not line two
              }
            }
          ]
        }
      }
    }
  }
}
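If the request is sent from a script rather than a console, the same body can be built as a plain dictionary, which also avoids the `//` comments that strict JSON parsers reject. This is a sketch only: `nested_term_query` is a hypothetical helper, and it assumes the `Key4` field has the `.keyword` sub-field that dynamic mapping creates by default for strings.

```python
import json


def nested_term_query(path, field, value):
    """Build a nested term query with inner_hits as a plain dict (hypothetical helper)."""
    return {
        "_source": False,  # suppress the parent document's _source
        "query": {
            "nested": {
                "path": path,
                "inner_hits": {},  # return the matching nested objects
                "query": {
                    "bool": {
                        "must": [
                            # Exact match on the un-analyzed keyword sub-field.
                            {"term": {f"{path}.{field}.keyword": value}}
                        ]
                    }
                },
            }
        },
    }


body = nested_term_query("data", "Key4", "line one")
print(json.dumps(body, indent=2))
```

The resulting `body` can be passed to whichever HTTP or Elasticsearch client is in use.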

Using `match`:
`line one` is tokenized as follows:
{
    "tokens": [
        {
            "token": "line",
            "start_offset": 0,
            "end_offset": 4,
            "type": "<ALPHANUM>",
            "position": 0
        },
        {
            "token": "one",
            "start_offset": 5,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 1
        }
    ]
}

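Similarly, `line two` produces two tokens, `line` and `two`.

`match` is a full-text search query, so analysis happens at both index time and search time. At search time `line one` is analyzed, and ES looks for `line` or `one`. `line two` contains the token `line`, so it is also part of the result.

To avoid this, you have to bypass analysis, which means using a `term` query. It looks for an exact match.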
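The difference can be sketched in a few lines of Python. This is a deliberately rough model: `analyze` only lowercases and splits on non-alphanumeric characters (an approximation of the standard analyzer, not its real behaviour), and `match` and `term` are hypothetical stand-ins for the two query types.

```python
import re


def analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


def match(query, value):
    """Model of match with the default OR operator: any shared token is a hit."""
    return bool(set(analyze(query)) & set(analyze(value)))


def term(query, keyword_value):
    """Model of term on a keyword field: the whole un-analyzed value must be equal."""
    return query == keyword_value


values = ["line one", "line two"]
match_hits = [v for v in values if match("line one", v)]
term_hits = [v for v in values if term("line one", v)]

print(match_hits)
print(term_hits)
```

In this model `match` selects both values because they share the token `line`, while `term` selects only the value that equals the query string exactly.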

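I see. However, in all the examples I find for nested queries, the key is always specified when running the query, e.g. mydata.Key4:"line one". If I change the data type to nested, can I still run a free-text query like "line one" and get back only that document?

Yes. You need to reindex the data, because the mapping has to change, and as I mentioned you need to use a nested query with inner hits.

I tried using the updated mapping and retrieving the matching document with inner hits, but still no luck. Could you help me out?

@FAlonso The result you get now is correct: you only get the nested inner hits that match your query. The problem is that querying `line one` with `match` matches both `line one` and `line two`, because the token `line` matches both nested documents. What you probably want is an exact match, so you can use a `term` query (instead of `match`), or match on the `data.Key4.keyword` field.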
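Once the query returns only the wanted inner hits, the matching objects still have to be pulled out of the response envelope. The sketch below assumes the response has already been parsed into a Python dict with the shape shown in the question; `inner_hit_sources` is a hypothetical helper, and the sample `response` is a hand-trimmed copy of the question's output, not new data.

```python
def inner_hit_sources(response, name):
    """Collect the _source of every inner hit stored under the given inner_hits name."""
    sources = []
    for hit in response.get("hits", {}).get("hits", []):
        inner = hit.get("inner_hits", {}).get(name, {})
        for inner_hit in inner.get("hits", {}).get("hits", []):
            sources.append(inner_hit["_source"])
    return sources


# Hand-trimmed copy of the match-query response from the question.
response = {
    "hits": {
        "hits": [
            {
                "_id": "QAZJ-nMBi6fwNevjDQJy",
                "inner_hits": {
                    "data": {
                        "hits": {
                            "hits": [
                                {"_nested": {"field": "data", "offset": 0},
                                 "_source": {"Key4": "line one"}},
                                {"_nested": {"field": "data", "offset": 1},
                                 "_source": {"Key4": "line two"}},
                            ]
                        }
                    }
                },
            }
        ]
    }
}

sources = inner_hit_sources(response, "data")
print(sources)
```

With the `term` query on `data.Key4.keyword`, the same helper would be expected to see only the `line one` object, since only that nested document matches exactly.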