Trying to form an Elasticsearch query for autocomplete

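I have read quite a bit, and it seems that EdgeGrams are a good way to implement autocomplete in a search application. I have already configured EdgeGrams in the settings of my index: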

PUT /bigtestindex
{
  "settings":{
    "analysis":{
      "analyzer":{
        "autocomplete":{
          "type":"custom",
          "tokenizer":"standard",
          "filter":[ "standard", "stop", "kstem", "ngram" ] 
        }
      },
      "filter":{
        "edgengram":{
          "type":"ngram",
          "min_gram":2,
          "max_gram":15
        }
      },
      "highlight": {
      "pre_tags" : ["<em>"],
      "post_tags" : ["</em>"],
        "fields": {
          "title.autocomplete": {
            "number_of_fragments": 1,
            "fragment_size": 250
          }
        } 
      }
    }
  }
}

Or do I need to use a multi-field type instead:

"title": {
        "type": "multi_field",
        "fields": {
          "title": {
            "type": "string"
          },
          "autocomplete": {
            "analyzer": "autocomplete",
            "type": "string",
            "index": "not_analyzed"
          }
        }
     },

I am using ES 1.4.1 and want to use the title field for autocomplete.

Short answer: you need to use the analyzer in a field mapping. For example:

PUT /test_index
{
   "settings": {
      "analysis": {
         "analyzer": {
            "autocomplete": {
               "type": "custom",
               "tokenizer": "standard",
               "filter": [
                  "standard",
                  "stop",
                  "kstem",
                  "ngram"
               ]
            }
         },
         "filter": {
            "edgengram": {
               "type": "ngram",
               "min_gram": 2,
               "max_gram": 15
            }
         }
      }
   },
   "mappings": {
      "doc": {
         "properties": {
            "field1": {
               "type": "string",
               "index_analyzer": "autocomplete",
               "search_analyzer": "standard"
            }
         }
      }
   }
}

Also, I don't think you want the "highlight" section in your index definition; it belongs in the query.

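Edit: When I tried out your code, I found a couple of problems with it. One is the highlight issue I already mentioned. The other is that you named your filter "edgengram" even though its type is "ngram" rather than "edgeNGram", and then you referenced the filter "ngram" in your analyzer. That picks up the built-in ngram filter with its default settings, not the filter you defined, which is probably not what you want. (Hint: you can use term vectors to see what your analyzer does to your documents; you will probably want to turn them off in production, though.)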
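To make the difference between the two filter types concrete, here is a minimal pure-Python sketch. It does not call Elasticsearch; the function names `ngrams` and `edge_ngrams` are hypothetical, and the token order is only an approximation of what the real filters emit. It is meant to show which substrings each kind of filter produces for a single token, using the `min_gram` of 2 and `max_gram` of 15 from the settings above.

```python
def ngrams(token, min_gram, max_gram):
    """Simplified model of an ngram filter: substrings from every start position."""
    out = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                out.append(token[start:start + size])
    return out


def edge_ngrams(token, min_gram, max_gram):
    """Simplified model of an edge n-gram filter: prefixes only."""
    longest = min(max_gram, len(token))
    return [token[:size] for size in range(min_gram, longest + 1)]


all_grams = ngrams("hello", 2, 15)
prefix_grams = edge_ngrams("hello", 2, 15)

print(all_grams)     # ['he', 'hel', 'hell', 'hello', 'el', 'ell', 'ello', 'll', 'llo', 'lo']
print(prefix_grams)  # ['he', 'hel', 'hell', 'hello']
```

With plain n-grams, a query term such as "ll" would match "hello" in the middle of the word, which is rarely what autocomplete should do. Edge n-grams only index prefixes, so matches start at the beginning of each word.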

So what you actually want is probably something like this:

"title": {
        "type": "string",
        "index_analyzer": "autocomplete",
        "search_analyzer": "standard"
      },

PUT /test_index
{
   "settings": {
      "analysis": {
         "analyzer": {
            "autocomplete": {
               "type": "custom",
               "tokenizer": "standard",
               "filter": [
                  "standard",
                  "stop",
                  "kstem",
                  "edgengram_filter"
               ]
            }
         },
         "filter": {
            "edgengram_filter": {
               "type": "edgeNGram",
               "min_gram": 2,
               "max_gram": 15
            }
         }
      }
   },
   "mappings": {
      "doc": {
         "properties": {
            "content": {
               "type": "string",
               "index_analyzer": "autocomplete",
               "search_analyzer": "standard"
            }
         }
      }
   }
}

When I index these two documents:

POST test_index/doc/_bulk
{"index":{"_id":1}}
{"content":"hello world"}
{"index":{"_id":2}}
{"content":"goodbye world"}

and run this query (there was also an error in your "highlight" block; it should say "fields", not "field"):

POST /test_index/doc/_search
{
   "query": {
      "match": {
         "content": {
            "query": "good wor",
            "operator": "and"
         }
      }
   },
   "highlight": {
      "pre_tags": [
         "<em>"
      ],
      "post_tags": [
         "</em>"
      ],
      "fields": {
         "content": {
            "number_of_fragments": 1,
            "fragment_size": 250
         }
      }
   }
}

I get this response, which seems to be what you are after, if I understand you correctly:

{
   "took": 5,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 0.2712221,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "2",
            "_score": 0.2712221,
            "_source": {
               "content": "goodbye world"
            },
            "highlight": {
               "content": [
                  "<em>goodbye</em> <em>world</em>"
               ]
            }
         }
      ]
   }
}
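The reason only "goodbye world" comes back can be sketched in a few lines of Python. This is a rough model under stated assumptions, not how Elasticsearch is implemented: lowercasing and whitespace splitting stand in for the standard tokenizer, the "stop" and "kstem" filters are ignored, and the helper names `index_terms` and `matches` are hypothetical. The idea it illustrates is that edge n-grams are generated at index time only, the query is analyzed without n-grams, and the "and" operator requires every query term to be found.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Prefixes of a token, mirroring the edgengram_filter settings above."""
    longest = min(max_gram, len(token))
    return {token[:size] for size in range(min_gram, longest + 1)}


def index_terms(text):
    """Index-time analysis (simplified): split, lowercase, expand into edge n-grams."""
    terms = set()
    for token in text.lower().split():
        terms |= edge_ngrams(token)
    return terms


def matches(query, text):
    """Search-time analysis (simplified): no n-grams; 'and' means all terms must match."""
    indexed = index_terms(text)
    return all(term in indexed for term in query.lower().split())


docs = {1: "hello world", 2: "goodbye world"}
hits = [doc_id for doc_id, text in docs.items() if matches("good wor", text)]
print(hits)  # [2]
```

Both documents contain the prefix "wor", but only document 2 has a word starting with "good", so it is the only one that satisfies both terms.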

Here is some code I used to test it:

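Yes, thank you. This helped me understand what I was trying to do! Would it be better to map autocomplete to the HTML title or to the body tag?

Be mindful of the size of the index that autocomplete creates for your use case. I can see an argument either way, but I would probably use the title.

Why use a "match" query rather than a "match_phrase" query? From what I have been reading, "match_phrase" seems to be the usual query type for autocomplete. Just curious... Still learning ES, and it is amazing!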
