Elasticsearch: unexpected results when using a normalizer with the keyword data type

I created an index like this:

PUT twitter
{
  "settings": {
    "index": {
      "analysis": {
        "normalizer": {
          "caseinsensitive_exact_match_normalizer": {
            "filter": "lowercase",
            "type": "custom"
          }
        },
        "analyzer": {
          "whitespace_lowercasefilter_analyzer": {
            "filter": "lowercase",
            "char_filter": "html_strip",
            "type": "custom",
            "tokenizer": "standard"
          }
        }
      }
    }
  },

  "mappings": {
    "test" : {
      "properties": {
        "col1" : {
          "type": "keyword"
        },
        "col2" : {
          "type": "keyword",
          "normalizer": "caseinsensitive_exact_match_normalizer"
        }
      }
    }
  }
}
Then I inserted a document into the index:

POST twitter/test
{
  "col1" : "Dhruv",
  "col2" : "Dhruv"
}
Then I queried the index:

GET twitter/_search
{
  "query": {
    "term": {
      "col2": {
        "value": "DHRUV"
      }
    }
  }
}
I got this result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "twitter",
        "_type": "test",
        "_id": "AV9yNWQb3aJEm8NgRhd_",
        "_score": 0.2876821,
        "_source": {
          "col1": "Dhruv",
          "col2": "Dhruv"
        }
      }
    ]
  }
}
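As I understand it, we should not get any result: a term query skips analysis, so it should look up DHRUV in the inverted index, while the value stored in the index should be dhruv because col2 uses caseinsensitive_exact_match_normalizer. I suspect the term query is not ignoring the normalizer. Is that right?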

I am using ES 5.4.1.

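For term queries, the normalizer is taken into account at search time, so the query value DHRUV is lowercased to dhruv before the lookup and matches the indexed term. However, as discussed in a related issue, it has been determined that this is not the expected behavior.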
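To make the behavior described above concrete, here is a toy model in plain Python. It is not Elasticsearch code: the class and function names are hypothetical, and it only assumes what the question observed, namely that the normalizer is applied both when a value is indexed and when a term query value is looked up.

```python
# Toy model of a keyword field with and without a lowercase normalizer.
# This is NOT Elasticsearch code; every name below is hypothetical.

def lowercase_normalizer(value: str) -> str:
    """Stand-in for a custom normalizer with a single lowercase filter."""
    return value.lower()


def identity(value: str) -> str:
    """Stand-in for a keyword field that has no normalizer."""
    return value


class ToyKeywordField:
    def __init__(self, normalizer=identity):
        self.normalizer = normalizer
        self.terms = {}  # indexed term -> set of document ids

    def index(self, doc_id: int, value: str) -> None:
        # The normalizer runs before the value is stored as a term.
        self.terms.setdefault(self.normalizer(value), set()).add(doc_id)

    def term_query(self, value: str) -> set:
        # Assumption taken from the question: the same normalizer is also
        # applied to the query value before the exact-term lookup.
        return self.terms.get(self.normalizer(value), set())


col1 = ToyKeywordField()                      # plain keyword
col2 = ToyKeywordField(lowercase_normalizer)  # keyword + normalizer

col1.index(1, "Dhruv")
col2.index(1, "Dhruv")

print(col2.term_query("DHRUV"))  # {1}
print(col1.term_query("DHRUV"))  # set()
print(col1.term_query("Dhruv"))  # {1}
```

In this model, a case-sensitive exact match is still available on col1, which has no normalizer, while col2 matches regardless of case.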

If you want to see what ES rewrites your query into, you can use the following:

GET /_validate/query?index=twitter&explain
{
  "query": {
    "term": {
      "col2": {
        "value": "DHRUV"
      }
    }
  }
}
This tells you why you are getting those results:

  "explanations": [
    {
      "index": "twitter",
      "valid": true,
      "explanation": "col2:dhruv"
    }
  ]
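If you check this from a script, a small helper can pull the rewritten term out of the response. The sketch below is a hypothetical helper, not part of any Elasticsearch client; it assumes the response has the shape shown above and that the explanation of a simple term query has the form field:term, which is what the sample output suggests but is not a guaranteed format.

```python
def explanations_by_index(response: dict) -> dict:
    """Map each index name to its rewritten query string (valid entries only)."""
    return {
        entry["index"]: entry["explanation"]
        for entry in response.get("explanations", [])
        if entry.get("valid")
    }


# Sample taken from the validate output shown above.
sample_response = {
    "explanations": [
        {"index": "twitter", "valid": True, "explanation": "col2:dhruv"}
    ]
}

rewritten = explanations_by_index(sample_response)

# Assumption: a simple term query is explained as "field:term".
field, _, term = rewritten["twitter"].partition(":")

print(field, term)      # col2 dhruv
print(term == "DHRUV")  # False: the query value was lowercased
```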