Python Elasticsearch full-text autocomplete


I'm using Elasticsearch through the Python requests library. I've set up my analyzers like this:

"analysis" : {
        "analyzer": {
            "my_basic_search": {
                "type": "standard",
                "stopwords": []
            },
            "my_autocomplete": {
                "type": "custom",
                "tokenizer": "keyword",
                "filter": ["lowercase", "autocomplete"]
            }
        },
        "filter": {
            "autocomplete": {
                "type": "edge_ngram",
                "min_gram": 1,
                "max_gram": 20
            }
        }
    }
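To make the effect of my_autocomplete concrete, here is a minimal pure-Python sketch that approximates the chain configured above (keyword tokenizer, lowercase, edge_ngram with min_gram 1 and max_gram 20). The function names are hypothetical and nothing here calls Elasticsearch; it only mimics the tokens the analyzer is expected to emit.

```python
def edge_ngrams(text, min_gram=1, max_gram=20):
    """Return the prefixes of text with lengths min_gram..max_gram."""
    upper = min(max_gram, len(text))
    return [text[:n] for n in range(min_gram, upper + 1)]


def my_autocomplete_tokens(text):
    """Approximate my_autocomplete: keyword tokenizer, lowercase, edge_ngram."""
    # The keyword tokenizer emits the whole input as a single token,
    # so whitespace stays inside the ngrams instead of splitting them.
    token = text.lower()
    return edge_ngrams(token, min_gram=1, max_gram=20)


print(my_autocomplete_tokens("Bill W"))
# ['b', 'bi', 'bil', 'bill', 'bill ', 'bill w']
```

Because the name is never split on whitespace, every indexed token is a prefix of the full artist name.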

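I have a list of artists that I'd like to search with autocomplete. My current test case is "bill w", which should match "bill withers" and so on. The artist mapping looks like this (this is the output of GET http://localhost:9200/my_index/artist/_mapping):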

…and then I run this query to perform the autocomplete:

"query": {
        "function_score": {
            "query": {
                "bool": {
                    "must" : { "match": { "clean_artist_name.autocomplete": "bill w" } },
                    "should" : { "match": { "clean_artist_name": "bill w" } }
                }
            },
            "functions": [
            {
                "script_score": {
                    "script": "artist-score"
                }
            }
            ]
        }
    }
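Since the request is sent from Python, one option is to build the body as a dict and serialize it with json.dumps, which guarantees valid JSON on the wire. This is only a sketch: build_autocomplete_query is a hypothetical helper, and the field names and the "artist-score" script are copied from the query above.

```python
import json


def build_autocomplete_query(text, script="artist-score"):
    """Build the function_score request body from the question as a dict."""
    return {
        "query": {
            "function_score": {
                "query": {
                    "bool": {
                        "must": {"match": {"clean_artist_name.autocomplete": text}},
                        "should": {"match": {"clean_artist_name": text}},
                    }
                },
                "functions": [{"script_score": {"script": script}}],
            }
        }
    }


# Serialize once; the resulting string is what goes in the request body.
body = json.dumps(build_autocomplete_query("bill w"))
print(body)
```

The string could then be passed as data=body to requests.post, for example against http://localhost:9200/my_index/artist/_search (that search URL is an assumption based on the mapping URL in the question).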

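This seems to match artists that contain "bill" or "w" as well as "bill withers", but I only want to match artists that contain the whole string. The analyzer seems to be working correctly; here is the output of http://localhost:9200/my_index/_analyze?analyzer=my_autocomplete&text=bill%20w: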

{
  "tokens" : [ {
    "token" : "b",
    "start_offset" : 0,
    "end_offset" : 6,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "bi",
    "start_offset" : 0,
    "end_offset" : 6,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "bil",
    "start_offset" : 0,
    "end_offset" : 6,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "bill",
    "start_offset" : 0,
    "end_offset" : 6,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "bill ",
    "start_offset" : 0,
    "end_offset" : 6,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "bill w",
    "start_offset" : 0,
    "end_offset" : 6,
    "type" : "word",
    "position" : 1
  } ]
}

So why aren't matches that contain only "bill" or "w" excluded? Is there something in my query that allows results matched only by the my_basic_search analyzer?

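I think you need a "term" query instead of a "match" query in the "must" clause. You have already split the artist names into ngrams, so your search text should match one of those ngrams exactly. A "term" query is not analyzed, so it compares the search text verbatim with the indexed ngrams: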

"query": {
    "function_score": {
        "query": {
            "bool": {
                "must" : { "term": { "clean_artist_name.autocomplete": "bill w" } },
                "should" : { "match": { "clean_artist_name": "bill w" } }
            }
        },
        "functions": [
        {
            "script_score": {
                "script": "artist-score"
            }
        }
        ]
    }
}
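The difference between the two clauses can be illustrated without a cluster. The sketch below is a simulation in plain Python, not Elasticsearch itself. It ASSUMES that the match query analyzes the search text by splitting it on whitespace (the mapping is not shown in the question, so the actual search analyzer is unknown) and combines the resulting words with OR, which is the match default. The artist list and helper names are made up for illustration.

```python
def index_tokens(name):
    """Edge ngrams of the whole lowercased name (keyword tokenizer)."""
    name = name.lower()
    return {name[:n] for n in range(1, min(20, len(name)) + 1)}


def term_hits(artists, text):
    """term: the search text is compared verbatim with the indexed tokens."""
    return [a for a in artists if text in index_tokens(a)]


def match_hits(artists, text):
    """match: the text is analyzed first (ASSUMED: split on whitespace),
    and a document matches if any resulting word is an indexed token."""
    words = text.lower().split()
    return [a for a in artists if any(w in index_tokens(a) for w in words)]


artists = ["Bill Withers", "Bill Evans", "Wilco", "Billy Joel"]
print(match_hits(artists, "bill w"))
# ['Bill Withers', 'Bill Evans', 'Wilco', 'Billy Joel']
print(term_hits(artists, "bill w"))
# ['Bill Withers']
```

One consequence of skipping analysis: the term query does not lowercase its input, so the client has to lowercase the search text itself before sending it.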

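Thanks, that worked. Also, do you know how to change the analyzer so that it removes stop words but does not split on whitespace? For example, so that "notorious b" could match "the notorious big". There doesn't seem to be a simple way to do this.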
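The thread leaves this follow-up unanswered. One possible direction, sketched here in plain Python rather than as a tested Elasticsearch configuration, is to strip stop words from the raw string before it reaches the keyword tokenizer, so the remaining text is still treated as a single token. In Elasticsearch a pattern_replace char filter might play that role, but that is an assumption and has not been verified against this index. The stop word list and function names below are hypothetical.

```python
import re

# Hypothetical stop word list, for illustration only.
STOPWORDS = ("the", "a", "an")

# Matches a whole stop word plus any whitespace that follows it.
_STOPWORD_RE = re.compile(r"\b(?:%s)\b\s*" % "|".join(STOPWORDS))


def normalize(text):
    """Lowercase, drop stop words, and keep the rest as one string."""
    text = _STOPWORD_RE.sub("", text.lower())
    # Collapse leftover runs of whitespace and trim the ends.
    return " ".join(text.split())


def autocomplete_tokens(name):
    """Edge ngrams (1..20 characters) of the normalized name."""
    name = normalize(name)
    return {name[:n] for n in range(1, min(20, len(name)) + 1)}


print(normalize("The Notorious BIG"))
# notorious big
print("notorious b" in autocomplete_tokens("The Notorious BIG"))
# True
```

The same normalization would need to be applied to the search text, so that a query such as "the notorious b" is reduced to "notorious b" before the exact comparison.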
