Elasticsearch: unexpected results when using a normalizer with the keyword data type
I created an index like this:
PUT twitter
{
  "settings": {
    "index": {
      "analysis": {
        "normalizer": {
          "caseinsensitive_exact_match_normalizer": {
            "filter": "lowercase",
            "type": "custom"
          }
        },
        "analyzer": {
          "whitespace_lowercasefilter_analyzer": {
            "filter": "lowercase",
            "char_filter": "html_strip",
            "type": "custom",
            "tokenizer": "standard"
          }
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "col1": {
          "type": "keyword"
        },
        "col2": {
          "type": "keyword",
          "normalizer": "caseinsensitive_exact_match_normalizer"
        }
      }
    }
  }
}
Then I inserted a document into the index:
POST twitter/test
{
  "col1": "Dhruv",
  "col2": "Dhruv"
}
Then I queried the index like this:
GET twitter/_search
{
  "query": {
    "term": {
      "col2": {
        "value": "DHRUV"
      }
    }
  }
}
I got this result:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "twitter",
        "_type": "test",
        "_id": "AV9yNWQb3aJEm8NgRhd_",
        "_score": 0.2876821,
        "_source": {
          "col1": "Dhruv",
          "col2": "Dhruv"
        }
      }
    ]
  }
}
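As I understand it, we should not get a result: a term query skips analysis, so it should look up DHRUV (uppercase) in the inverted index, while the value stored in the index should be dhruv (lowercase), because col2 uses caseinsensitive_exact_match_normalizer. I suspect the term query is not ignoring the normalizer. Is that the case?
I am using ES 5.4.1.
For a term query, the normalizer is taken into account at search time. However, according to the issue mentioned, it was determined that this is not the expected behavior.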
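As a rough illustration of the behavior described above, here is a toy Python model of a keyword field whose normalizer runs both when a value is indexed and when a term query is executed. The class `ToyKeywordField` and its methods are hypothetical names invented for this sketch; it is not how Elasticsearch is implemented, only a way to see why `DHRUV` matches on `col2` but not on `col1`.

```python
from collections import defaultdict


def lowercase_normalizer(value: str) -> str:
    """Stand-in for a custom normalizer with a lowercase filter."""
    return value.lower()


def identity(value: str) -> str:
    """Stand-in for a keyword field with no normalizer."""
    return value


class ToyKeywordField:
    """Hypothetical toy model of a keyword field (not Elasticsearch code)."""

    def __init__(self, normalizer=identity):
        self.normalizer = normalizer
        self.postings = defaultdict(set)  # term -> set of document ids

    def index(self, doc_id: int, value: str) -> None:
        # The normalizer is applied before the term is stored.
        self.postings[self.normalizer(value)].add(doc_id)

    def term_query(self, value: str) -> set:
        # Assumption being modelled: the same normalizer is applied
        # to the query term at search time.
        return set(self.postings.get(self.normalizer(value), set()))


col1 = ToyKeywordField()                                 # plain keyword
col2 = ToyKeywordField(normalizer=lowercase_normalizer)  # keyword + normalizer

col1.index(1, "Dhruv")
col2.index(1, "Dhruv")

print(col1.term_query("DHRUV"))  # set()  (stored term is "Dhruv")
print(col2.term_query("DHRUV"))  # {1}    (both sides become "dhruv")
```

Under this model, a case-sensitive exact match is still available by querying the field that has no normalizer (`col1` in the question's mapping).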
If you want to see what ES rewrites your query into, you can use the validate API with explain:
GET /_validate/query?index=twitter&explain
{
  "query": {
    "term": {
      "col2": {
        "value": "DHRUV"
      }
    }
  }
}
This will tell you why you got those results; the query term has been lowercased to dhruv:
"explanations": [
  {
    "index": "twitter",
    "valid": true,
    "explanation": "col2:dhruv"
  }
]
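If you would rather run the same check from a script, the sketch below builds the validate request with Python's standard library. The host `http://localhost:9200` and the helper name `build_validate_request` are assumptions made for illustration; the request is only constructed here, not sent, and it uses the `/{index}/_validate/query` path form with POST rather than the exact URL shown above.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def build_validate_request(index: str, field: str, value: str,
                           base_url: str = ES_URL) -> Request:
    """Build (but do not send) a _validate/query request with explain enabled."""
    body = {"query": {"term": {field: {"value": value}}}}
    return Request(
        url=f"{base_url}/{index}/_validate/query?explain=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_validate_request("twitter", "col2", "DHRUV")
print(req.get_method(), req.full_url)
```

To actually send it, pass `req` to `urllib.request.urlopen` and parse the response with `json.load`; the `explanations` array in the response is where the rewritten query appears.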