Python Elasticsearch full-text autocomplete
I'm using Elasticsearch from Python through the requests library. I've set up my analyzers like this:
"analysis": {
  "analyzer": {
    "my_basic_search": {
      "type": "standard",
      "stopwords": []
    },
    "my_autocomplete": {
      "type": "custom",
      "tokenizer": "keyword",
      "filter": ["lowercase", "autocomplete"]
    }
  },
  "filter": {
    "autocomplete": {
      "type": "edge_ngram",
      "min_gram": 1,
      "max_gram": 20
    }
  }
}
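For reference, a minimal sketch of how settings like these might be sent with requests when creating the index. The node URL and the index name my_index are taken from the URLs in the question; the helper names build_settings and create_index are hypothetical, and the mapping is left out because the question does not show it.

```python
ES_URL = "http://localhost:9200"  # assumed local node, as in the question's URLs


def build_settings():
    """Return an index-creation body wrapping the analysis section above."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "my_basic_search": {"type": "standard", "stopwords": []},
                    "my_autocomplete": {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase", "autocomplete"],
                    },
                },
                "filter": {
                    "autocomplete": {
                        "type": "edge_ngram",
                        "min_gram": 1,
                        "max_gram": 20,
                    }
                },
            }
        }
    }


def create_index(index="my_index"):
    """Create the index with these analyzers (hypothetical helper)."""
    import requests  # imported here so build_settings() works without requests

    response = requests.put(f"{ES_URL}/{index}", json=build_settings())
    response.raise_for_status()  # surface HTTP errors such as malformed settings
    return response.json()
```

Building the body as a Python dict and letting requests serialize it avoids hand-written JSON mistakes such as trailing commas.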
I have a list of artists that I want to search with autocomplete. My current test case is "bill w", which should match "bill withers" and so on. The artist mapping looks like this (this is the output of GET http://localhost:9200/my_index/artist/_mapping):
…
Then I run this query to perform the autocomplete:
"query": {
  "function_score": {
    "query": {
      "bool": {
        "must": { "match": { "clean_artist_name.autocomplete": "bill w" } },
        "should": { "match": { "clean_artist_name": "bill w" } }
      }
    },
    "functions": [
      {
        "script_score": {
          "script": "artist-score"
        }
      }
    ]
  }
}
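A rough sketch of how a query like this might be issued through requests. The endpoint layout (index my_index, type artist) follows the URLs in the question; build_query and search_artists are hypothetical helper names, and the "artist-score" script is assumed to already exist on the cluster.

```python
ES_URL = "http://localhost:9200"  # assumed local node


def build_query(text, script="artist-score"):
    """Build the function_score body from the question for a given search text."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": {"match": {"clean_artist_name.autocomplete": text}},
                        "should": {"match": {"clean_artist_name": text}},
                    }
                },
                "functions": [{"script_score": {"script": script}}],
            }
        }
    }


def search_artists(text, index="my_index", doc_type="artist"):
    """Run the query and return the matching documents (hypothetical helper)."""
    import requests  # imported here so build_query() works without requests

    url = f"{ES_URL}/{index}/{doc_type}/_search"
    response = requests.post(url, json=build_query(text))
    response.raise_for_status()
    return [hit["_source"] for hit in response.json()["hits"]["hits"]]
```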
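This seems to match artists that contain "bill" or "w" as well as "bill withers", but I only want to match artists that contain the whole string. The analyzer seems to be working correctly; this is the output of http://localhost:9200/my_index/_analyze?analyzer=my_autocomplete&text=bill%20w: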
{
"tokens" : [ {
"token" : "b",
"start_offset" : 0,
"end_offset" : 6,
"type" : "word",
"position" : 1
}, {
"token" : "bi",
"start_offset" : 0,
"end_offset" : 6,
"type" : "word",
"position" : 1
}, {
"token" : "bil",
"start_offset" : 0,
"end_offset" : 6,
"type" : "word",
"position" : 1
}, {
"token" : "bill",
"start_offset" : 0,
"end_offset" : 6,
"type" : "word",
"position" : 1
}, {
"token" : "bill ",
"start_offset" : 0,
"end_offset" : 6,
"type" : "word",
"position" : 1
}, {
"token" : "bill w",
"start_offset" : 0,
"end_offset" : 6,
"type" : "word",
"position" : 1
} ]
}
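The token list above can be reproduced with a few lines of plain Python. This is only an approximation of the my_autocomplete analyzer (keyword tokenizer, lowercase filter, edge n-grams from 1 to 20 characters), not Elasticsearch's own implementation; the function name autocomplete_tokens is made up for illustration.

```python
def autocomplete_tokens(text, min_gram=1, max_gram=20):
    """Approximate my_autocomplete: one lowercased token, expanded to edge n-grams."""
    token = text.lower()  # the keyword tokenizer keeps the whole input as one token
    longest = min(len(token), max_gram)
    # Every prefix of the token from min_gram up to max_gram characters.
    return [token[:n] for n in range(min_gram, longest + 1)]


print(autocomplete_tokens("bill w"))
# ['b', 'bi', 'bil', 'bill', 'bill ', 'bill w']
```

Because the space is kept inside the single token, "bill w" itself is one of the indexed prefixes of "bill withers".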
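So why aren't matches that contain only "bill" or "w" excluded? Is something in my query letting through results that match only via the my_basic_search analyzer?

I think you need a "term" query rather than a "match" query in the "must" clause. A "match" query analyzes the search text before looking it up, whereas a "term" query looks it up exactly as given. You have already split the artist names into ngrams, so your search text should match one of those ngrams exactly. For that you need a "term" query:
"query": {
  "function_score": {
    "query": {
      "bool": {
        "must": { "term": { "clean_artist_name.autocomplete": "bill w" } },
        "should": { "match": { "clean_artist_name": "bill w" } }
      }
    },
    "functions": [
      {
        "script_score": {
          "script": "artist-score"
        }
      }
    ]
  }
}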
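To illustrate the difference, here is a small pure-Python simulation. It assumes the autocomplete subfield is searched with a word-splitting analyzer such as my_basic_search (the mapping is not shown in the question, so this is a guess), and the artist names other than Bill Withers are invented sample data.

```python
import re


def index_tokens(name, max_gram=20):
    """Edge n-grams of the whole lowercased name (approximates my_autocomplete)."""
    token = name.lower()
    return {token[:n] for n in range(1, min(len(token), max_gram) + 1)}


def match_query(text, names):
    """Approximate `match`: the text is split into words and any word may hit."""
    words = re.findall(r"\w+", text.lower())  # rough stand-in for the standard analyzer
    return [n for n in names if any(w in index_tokens(n) for w in words)]


def term_query(text, names):
    """Approximate `term`: the text is looked up as-is, with no analysis."""
    return [n for n in names if text in index_tokens(n)]


artists = ["Bill Withers", "Bill Wyman", "Billy Joel", "Weezer"]
print(match_query("bill w", artists))  # all four artists
print(term_query("bill w", artists))   # ['Bill Withers', 'Bill Wyman']
```

Since a term query does no analysis, the search text has to be lowercased by the caller to line up with the lowercased ngrams in the index.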
Thanks, that worked. Also, do you know how to change the analyzer so that it removes stop words but does not split on whitespace? For example, so that "notorious b" would match "the notorious big".

There doesn't seem to be a simple way to do that.