elasticsearch: Trying to form an Elasticsearch query for autocomplete
I have read a lot, and it seems that using EdgeNGrams is a good way to implement autocomplete for a search application. I have already configured EdgeNGrams in the settings for my index:
PUT /bigtestindex
{
"settings":{
"analysis":{
"analyzer":{
"autocomplete":{
"type":"custom",
"tokenizer":"standard",
"filter":[ "standard", "stop", "kstem", "ngram" ]
}
},
"filter":{
"edgengram":{
"type":"ngram",
"min_gram":2,
"max_gram":15
}
},
"highlight": {
"pre_tags" : ["<em>"],
"post_tags" : ["</em>"],
"fields": {
"title.autocomplete": {
"number_of_fragments": 1,
"fragment_size": 250
}
}
}
}
}
}
Or do I need to use the multi-field type:
"title": {
"type": "multi_field",
"fields": {
"title": {
"type": "string"
},
"autocomplete": {
"analyzer": "autocomplete",
"type": "string",
"index": "not_analyzed"
}
}
},
I am using ES 1.4.1 and want to use the title field for autocomplete.

Short answer: you need to use the analyzer in a field mapping. For example:
PUT /test_index
{
"settings": {
"analysis": {
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"standard",
"stop",
"kstem",
"ngram"
]
}
},
"filter": {
"edgengram": {
"type": "ngram",
"min_gram": 2,
"max_gram": 15
}
}
}
},
"mappings": {
"doc": {
"properties": {
"field1": {
"type": "string",
"index_analyzer": "autocomplete",
"search_analyzer": "standard"
}
}
}
}
}
For more discussion, see: and
Also, I don't think you want the "highlight" section in your index definition; it belongs in the query.
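Edit: On trying out your code, I found a couple of problems with it. One is the highlight issue I already mentioned. The other is that you named your filter "edgengram" even though its type is "ngram" rather than "edgeNGram", and then referenced the filter "ngram" in your analyzer. That reference picks up the built-in ngram filter with its default settings, not the filter you defined, which probably does not give you what you want. (Hint: you can inspect what your analyzer is doing to your documents; you will probably want to turn that off in production, though.)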
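To make the difference between the two filter types concrete, here is a rough Python sketch of which tokens each kind of filter would emit for a single word. This is not Elasticsearch code: the helper names `ngrams` and `edge_ngrams` are made up for illustration, and the sketch ignores token order and positions. It assumes the built-in ngram filter defaults of min_gram 1 and max_gram 2.

```python
def ngrams(token, min_gram, max_gram):
    """Substrings of length min_gram..max_gram taken at every start position."""
    return [
        token[i:i + n]
        for i in range(len(token))
        for n in range(min_gram, max_gram + 1)
        if i + n <= len(token)
    ]


def edge_ngrams(token, min_gram, max_gram):
    """Only prefixes of length min_gram..max_gram, anchored at the front."""
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]


# Assumed defaults of the built-in "ngram" filter: min_gram=1, max_gram=2.
print(ngrams("world", 1, 2))
# Settings from the custom filter in the question: min_gram=2, max_gram=15.
print(edge_ngrams("world", 2, 15))
```

The first call yields one- and two-character fragments from anywhere in the word, so a query fragment like "or" would match "world". The second yields only the prefixes "wo", "wor", "worl" and "world", which is the behavior autocomplete usually needs.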
So what you really want is probably something like this:
"title": {
"type": "string",
"index_analyzer": "autocomplete",
"search_analyzer": "standard"
},
PUT /test_index
{
"settings": {
"analysis": {
"analyzer": {
"autocomplete": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"standard",
"stop",
"kstem",
"edgengram_filter"
]
}
},
"filter": {
"edgengram_filter": {
"type": "edgeNGram",
"min_gram": 2,
"max_gram": 15
}
}
}
},
"mappings": {
"doc": {
"properties": {
"content": {
"type": "string",
"index_analyzer": "autocomplete",
"search_analyzer": "standard"
}
}
}
}
}
When I index these two documents:
POST test_index/doc/_bulk
{"index":{"_id":1}}
{"content":"hello world"}
{"index":{"_id":2}}
{"content":"goodbye world"}
and run this query (there was also an error in your "highlight" block; it should say "fields" rather than "field"):
POST /test_index/doc/_search
{
"query": {
"match": {
"content": {
"query": "good wor",
"operator": "and"
}
}
},
"highlight": {
"pre_tags": [
"<em>"
],
"post_tags": [
"</em>"
],
"fields": {
"content": {
"number_of_fragments": 1,
"fragment_size": 250
}
}
}
}
I get this response, which seems to be what you are looking for, if I understand you correctly:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2712221,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": 0.2712221,
"_source": {
"content": "goodbye world"
},
"highlight": {
"content": [
"<em>goodbye</em> <em>world</em>"
]
}
}
]
}
}
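Why only document 2 comes back can be sketched in a few lines of Python. This is a simplified model, not what Elasticsearch actually executes: the helper names are hypothetical, tokenization is reduced to lowercasing and splitting on whitespace, and the "stop" and "kstem" filters are left out. The index side expands each word into edge n-grams (min_gram 2, max_gram 15), the search side keeps the query words as typed, and the "and" operator requires every query word to be among the indexed terms.

```python
def edge_ngrams(token, min_gram=2, max_gram=15):
    """Prefixes of the token, mimicking the edgeNGram filter settings above."""
    return {token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)}


def index_terms(text):
    """Index-time analysis (simplified): split into words, expand to prefixes."""
    terms = set()
    for token in text.lower().split():
        terms |= edge_ngrams(token)
    return terms


def search_terms(text):
    """Search-time analysis (simplified): split into words, no n-grams."""
    return text.lower().split()


def matches_all(query, document):
    """Model of a match query with operator "and": every query term must be indexed."""
    indexed = index_terms(document)
    return all(term in indexed for term in search_terms(query))


docs = {1: "hello world", 2: "goodbye world"}
hits = [doc_id for doc_id, content in docs.items() if matches_all("good wor", content)]
print(hits)  # [2]
```

Both documents contain the prefix "wor", but only "goodbye world" also contains the prefix "good", so it is the only hit.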
Here is some code I used to test it:
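Yes, thank you. This helped me understand what I am trying to do! Is it better to map autocomplete to the html title or to the body tag? I am mindful of the size of the index created for autocomplete in the attachments use case.

I could see an argument for either, but I would probably go with title.

Why use a "match" query rather than a "match_phrase" query? I have been reading, and the query type for autocomplete seems to be "match_phrase". Just curious... Still learning ES, and it is amazing!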