Elasticsearch: incorrect score


The behavior when searching for several words is not clear to me. I am running the following query:

"query": {
    "bool": {
        "must": {
            "multi_match": {
                "query": "Testcat y",
                "type": "cross_fields",
                "fields": fields
            }
        }
    }
}

But the search results look strange to me:

{
    "score": 7.925287,
    "text": "Yappies",
},
{
    "score": 7.925287,
    "text": "YourPetBuddy",
},
{
    "score": 7.925287,
    "text": "YourDog",
},
{
    "score": 6.270683,
    "text": "Testcat",
},

I use the following settings:

BASE_SETTINGS = {
    'settings': {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        'analysis': {

            'filter': {
                'autocomplete_filter': {
                    'type': 'edge_ngram',
                    'min_gram': 1,
                    'max_gram': 16
                }
            },

            'analyzer': {
                'autocomplete': {
                    'type': 'custom',
                    'tokenizer': "standard",
                    'filter': [
                        'lowercase',
                        'autocomplete_filter'
                    ]
                }
            }
        }
    }
}

Shouldn't Testcat have a higher score, since it matches a larger part of the search string?

Upd: for searching, I already use the standard search analyzer:

'properties': {
    field['name']: {
        'type': 'text',
        'analyzer': 'autocomplete',
        'search_analyzer': 'standard'
    } for field in MAPPING_FIELDS[index]['fields']
}

How can I assign the highest score to the longest matching prefix?

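I'm assuming you use the same analyzer for indexing and for the search query. At index time, it creates edge n-grams (up to 16 characters long) for this field in each document: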

Testcat: t te tes ...
Yappies: y ya yap...
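
To make that expansion concrete, here is a minimal pure-Python sketch of roughly what the autocomplete analyzer from the question emits. It is only an approximation: the standard tokenizer is imitated with a simple regular expression, and the helper names (standard_tokenize, edge_ngrams, autocomplete_analyze) are made up for illustration, not part of Elasticsearch.

```python
import re


def standard_tokenize(text):
    """Rough stand-in for the standard tokenizer plus the lowercase filter."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def edge_ngrams(token, min_gram=1, max_gram=16):
    """Return the prefixes of `token` with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def autocomplete_analyze(text):
    """Approximate the 'autocomplete' analyzer: tokenize, lowercase, edge n-grams."""
    return [gram for tok in standard_tokenize(text) for gram in edge_ngrams(tok)]


# Tokens stored in the index for a few of the documents from the question.
index_tokens = {
    name: autocomplete_analyze(name)
    for name in ["Testcat", "Yappies", "YourPetBuddy", "YourDog"]
}

# Query terms as a plain standard analyzer would produce them (no n-grams).
query_tokens = standard_tokenize("Testcat y")

for name, tokens in index_tokens.items():
    matched = [q for q in query_tokens if q in tokens]
    print(name, "matches query terms:", matched)
```

In this sketch every document whose text starts with "y" contains the indexed token "y", so the single-letter query term matches all of them, while "testcat" only matches the Testcat document.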

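If no different analyzer is used at search time, the same happens to each token of your query (the standard tokenizer splits it on whitespace):

Testcat: t te tes ...
y: y

Because of the large number of tokens, many of your documents get a hit. I am guessing here, but "y" is probably a very distinctive token for this field in this index, so the documents containing it are scored as more relevant.

Try using a different analyzer at search time, one that does not include the edge_ngram filter.

The same applies with the standard search analyzer: "y" is a token in the search, "y" is also a token in the documents (because of edge_ngram), and "y" is pretty unique in the index. If you searched for

testcast yyy

your results would probably be closer to what you expect. Try running your search query with the ?explain=true parameter. You will get a detailed breakdown of the scoring:

GET index/_search?explain=true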
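
As an alternative to the URL parameter, the search API also accepts an explain flag in the request body. The following is a minimal sketch that builds such a body in Python for the query from the question. The function name build_search_body and the field name "text" are assumptions for illustration (the question does not show the contents of fields), and sending the body to a cluster is left out.

```python
import json


def build_search_body(query_text, fields, explain=True):
    """Build a search request body for the cross_fields query from the question.

    `fields` is whatever list of field names the index uses; the caller
    supplies it, since the question does not show the actual list.
    """
    body = {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": query_text,
                        "type": "cross_fields",
                        "fields": list(fields),
                    }
                }
            }
        }
    }
    if explain:
        # Ask Elasticsearch to attach a per-hit scoring explanation.
        body["explain"] = True
    return body


# "text" is a hypothetical field name used only for this example.
body = build_search_body("Testcat y", ["text"])
print(json.dumps(body, indent=2))
```

Each hit in the response should then carry an explanation section showing how the term matches contributed to its score, which makes it easier to see why the documents starting with "y" outrank Testcat.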