Elasticsearch query and filter give different document counts when using the Lucene fuzzy operator




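Using Elasticsearch v1.7.2 and a fairly large index, I get different document counts for the following two searches, both of which use a fuzzy search in the query string.

Query: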

{
  "query": {
     "query_string": {
        "query": "rapt~4"
     }
  }
}

curl -XPOST "http://localhost:9200/index-name/example-type/_search" -H "Content-Type: application/json" -d'{"query":{"query_string":{"query":"rapt~"}},"explain":true}'

Filter:

{
 "filter": {
    "query": {
       "query_string": {
          "query": "rapt~4"
       }
    }
 }
}

curl -XPOST "http://localhost:9200/index-name/example-type/_search" -H "Content-Type: application/json" -d'{"query":{"filtered":{"filter":{"fquery":{"query":{"query_string":{"query":"rapt~"}}}}}},"explain":true}'

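The filter returns about 5% more results than the query. Why do the document counts differ? Is there an option I can specify to make them consistent?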

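Note that the inconsistency only appears with a moderately large dataset. I tried inserting some ()

In versions of Elasticsearch before 5.x, a `filter` at the top level of the request is a post filter. Post filters are generally only relevant when using aggregations.

Starting with Elasticsearch 5.0, you must explicitly write `post_filter`, which avoids this confusion.

So the difference is that the top-level query actually restricts the results to a set of matching documents. The post filter effectively matches everything, then only removes results from the hits, without affecting the counts.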
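The distinction described above can be illustrated with a toy model in plain Python. This is not Elasticsearch code: the documents, field names, and the `search` function are made up for illustration, and "counts" is taken here to mean aggregation bucket counts, which is an assumption about what the answer intends.

```python
from collections import Counter

# Toy in-memory documents; the fields and values are made up for illustration.
DOCS = [
    {"id": 1, "color": "red", "brand": "acme"},
    {"id": 2, "color": "blue", "brand": "acme"},
    {"id": 3, "color": "red", "brand": "zenith"},
    {"id": 4, "color": "blue", "brand": "zenith"},
]


def search(docs, query=None, post_filter=None, agg_field=None):
    """Simplified model of the phases described above (hypothetical helper).

    1. `query` restricts the set of matching documents.
    2. Aggregations are computed over that matched set.
    3. `post_filter` then removes documents from the returned hits only.
    """
    matched = [d for d in docs if query is None or query(d)]
    aggs = Counter(d[agg_field] for d in matched) if agg_field else Counter()
    hits = [d for d in matched if post_filter is None or post_filter(d)]
    return {"hits": [d["id"] for d in hits], "aggs": dict(aggs)}


def is_red(doc):
    return doc["color"] == "red"


# Same predicate, applied once as a query and once as a post filter.
as_query = search(DOCS, query=is_red, agg_field="brand")
as_post_filter = search(DOCS, post_filter=is_red, agg_field="brand")

print(as_query)
print(as_post_filter)
```

In this model both calls return the same hits, but the aggregation buckets differ, because the post filter runs after the buckets have been counted.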
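…the query score seems to be using

Queries always compute scores; they are meant to help order items correctly by relevance (score). Filters never compute scores; filters are for pure boolean logic and do not affect "relevance" beyond inclusion or exclusion.

To be fair, in Elasticsearch 1.x you can convert any query into a filter in several ways (in 2.x, every query is also a filter in the right context!), but I tend to use `fquery`. If you do that, you should get the same results:

As a query: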

{ "query": { "query_string": { "query": "rapt~" } } }
As a filter:

{ "query": { "filtered": { "filter": { "fquery": { "query": { "query_string": { "query": "rapt~" } } } } } } }
In ES 2.x, the filter is also simplified to a `bool` query with a `filter` clause (the query is unchanged).
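As a rough sketch, the two filter wrappers can be built from the same query with small helpers. The function names `as_fquery_filter` and `as_bool_filter` are hypothetical, and the ES 2.x shape is assumed to be a `bool` query with a `filter` clause; the code only constructs request bodies and does not contact a cluster.

```python
import json


def as_fquery_filter(query):
    """Wrap a query in the ES 1.x `filtered` / `fquery` form (hypothetical helper)."""
    return {"query": {"filtered": {"filter": {"fquery": {"query": query}}}}}


def as_bool_filter(query):
    """Wrap a query in the ES 2.x `bool` / `filter` form (hypothetical helper)."""
    return {"query": {"bool": {"filter": query}}}


# The fuzzy query_string from the question.
fuzzy = {"query_string": {"query": "rapt~"}}

body_1x = as_fquery_filter(fuzzy)
body_2x = as_bool_filter(fuzzy)

# Serialize to JSON so the bodies can be pasted into a curl -d argument.
print(json.dumps(body_1x))
print(json.dumps(body_2x))
```

Building both bodies from one `fuzzy` dictionary keeps the inner query identical, so any difference in counts comes from the wrapper rather than from a typo in the query string.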

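Why do you have it as a filter inside `constant_score`? Have you tried using a query there?

Good point. Using a constant_score query without the filter gives the same results as the simple query. It is the filter part that causes the difference. The following gives the same results as the constant_score query above: {"filter":{"query":{"query_string":{"query":"rapt~"}}}}

I have updated my question. Thanks!

You are querying the analyzed _all field. When you use a query, your query is passed through the analyzer. When you use a filter, it may not be, or it may be passed through in a slightly different way. Try adding explain:true or using one of the other debugging features in Elasticsearch.

Could you also add the requests you are issuing, including the `cURL` commands? Feel free to obfuscate the index/type names.

This is strange, because you have `org.elasticsearch.index.search.nested.NonNestedDocsFilter`, which means something odd is happening in your filtered version. My filter explanation is "description": "ConstantScore(cache(QueryWrapperFilter(_all:titl~2))), product of:" and nothing else, which means your filter is being modified in some way before it gets into ES.

Unfortunately, that is not entirely true. Filters do affect the count. With this filter: {"query":{"filtered":{"filter":{"fquery":{"query":{"query_string":{"query":"rapt~"}}}}}}} I get exactly the same results as with my filter: {"filter":{"query":{"query_string":{"query":"rapt~"}}}} Again, that is about 5% more than the query (my dataset is much larger).

Is that the entire request being sent, or is there anything else? Does this involve any aliases, or a plain index?

Yes, that is the entire request. The index I am accessing does have aliases, but the difference still appears when I target a single index directly.

Can you add samples of the two separate responses to your question?

Just added, hopefully with enough detail to be useful!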
Request using constant_score with a filter:

curl -XPOST "http://localhost:9200/index-name/example-type/_search" -H "Content-Type: application/json" -d'{"query":{"constant_score":{"filter":{"query":{"query_string":{"query":"rapt~"}}}}},"explain":true}'

Example hit with its explanation:

{ "_source": { "type": "example", "content": "to the fact that" }, "_explanation": { "value": 1, "description": "ConstantScore(QueryWrapperFilter(_all:rapt~2)), product of:", "details": [ { "value": 1, "description": "boost" }, { "value": 1, "description": "queryNorm" } ] } }

A bool query that requires the filter version (must) and excludes the query version (must_not):

{ "explain": true, "query": { "bool": { "must_not": [ { "query_string": { "query": "rap~", "fields": [ "body" ] } } ], "must": [ { "constant_score": { "filter": { "query": { "query_string": { "query": "rap~", "fields": [ "body" ] } } } } } ] } } }
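To see which documents account for the difference, one option is to compare the hit IDs of the two responses on the client side. The sketch below assumes each response has already been parsed into a dictionary with the usual `hits.hits[]._id` layout; the helper names and the sample responses are made up for illustration.

```python
def hit_ids(response):
    """Collect the `_id` of every hit in a parsed search response."""
    return {hit["_id"] for hit in response["hits"]["hits"]}


def only_in_filter(query_response, filter_response):
    """Return IDs present in the filter response but absent from the query response."""
    return sorted(hit_ids(filter_response) - hit_ids(query_response))


# Made-up responses, trimmed to the fields used here.
query_response = {
    "hits": {"total": 2, "hits": [{"_id": "a"}, {"_id": "b"}]}
}
filter_response = {
    "hits": {"total": 3, "hits": [{"_id": "a"}, {"_id": "b"}, {"_id": "c"}]}
}

extra = only_in_filter(query_response, filter_response)
print(extra)
```

A search response only contains as many hits as the request's `size` (10 by default), so this comparison is only meaningful when both requests return their full result sets. The IDs it reports can then be inspected with explain enabled.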

ES 2.x filter form:

{ "query": { "bool": { "filter": { "query_string": { "query": "rapt~" } } } } }