ElasticSearch query/search/match
I inserted 3 records into an ElasticSearch index, as follows:
curl -XPOST 'http://127.0.0.1:9200/geoindex_test/STREET?pretty=1' -d '
{ "cityNames" : [ { "language" : "ENG",
"name" : "w bridgewater",
"raw_name" : "W BRIDGEWATER"
},
{ "language" : "ENG",
"name" : "west bridgewater",
"raw_name" : "West Bridgewater"
}
],
"id" : 1,
"streetNames" : [ { "language" : "ENG",
"name" : "cram rd",
"raw_name" : "Cram Rd"
} ]
}'
curl -XPOST 'http://127.0.0.1:9200/geoindex_test/STREET?pretty=1' -d '
{ "cityNames" : [ { "language" : "ENG",
"name" : "bridgewater corners",
"raw_name" : "BRIDGEWATER CORNERS"
},
{ "language" : "ENG",
"name" : "bridgewater center",
"raw_name" : "Bridgewater Center"
}
],
"id" : 2,
"streetNames" : [ { "language" : "ENG",
"name" : "valley view rd",
"raw_name" : "Valley View Rd"
} ]
}'
curl -XPOST 'http://127.0.0.1:9200/geoindex_test/STREET?pretty=1' -d '
{ "cityNames" : [ { "language" : "ENG",
"name" : "bridgewater",
"raw_name" : "Bridgewater"
},
{ "language" : "ENG",
"name" : "windsor",
"raw_name" : "Windsor"
}
],
"id" : 3,
"streetNames" : [ { "language" : "ENG",
"name" : "valley view rd",
"raw_name" : "Valley View Rd"
} ]
}'
I then run the following search:
curl -XGET 'http://127.0.0.1:9200/geoindex_test/STREET/_search?pretty=1' -d '
{
"query" : {
"match" : { "cityNames.name" : "bridgewater" }
}
}'
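I expected ElasticSearch to return the third record (id==3) as the best match, since record 3 is the only one with a city name that matches "bridgewater" exactly. Instead it returns the record with id 1 (w bridgewater) as the best match. What am I doing wrong?

I think this is because you are using inner objects, which essentially collapse the objects beneath them into one for search purposes. So when you query the name field of object 1, for example, you are querying ["w bridgewater", "west bridgewater"] rather than the discrete fields you might imagine.

Because "bridgewater" appears twice in objects 1 and 2 (once in each of their two name fields) and only once in object 3, objects 1 and 2 rank higher in the search. Object 1 ends up on top because the field in which "bridgewater" appears is shorter than the corresponding string in object 2 ("w bridgewater" vs. "bridgewater corners").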
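To make the effect of the flattening concrete, here is a rough sketch in plain shell (no ElasticSearch involved) that counts how often "bridgewater" occurs once all `cityNames.name` values of a document are merged into a single field. The `term_freq` helper is hypothetical, and splitting on spaces is only an approximation of what the analyzer does with these lowercase values.

```shell
# Hypothetical helper: count exact occurrences of a term across all values
# of a flattened, multi-valued field. Splitting on spaces only approximates
# the analyzer; it is enough for the lowercase names in the question.
term_freq() {
  term=$1
  shift
  printf '%s\n' "$@" | tr ' ' '\n' | grep -c -x -- "$term"
}

# cityNames.name values of the three documents, as indexed in the question.
tf1=$(term_freq bridgewater "w bridgewater" "west bridgewater")
tf2=$(term_freq bridgewater "bridgewater corners" "bridgewater center")
tf3=$(term_freq bridgewater "bridgewater" "windsor")

echo "doc1 tf=$tf1 doc2 tf=$tf2 doc3 tf=$tf3"
```

Documents 1 and 2 each contain the term twice in the merged field, while document 3 contains it only once, which is why the exact match does not win on term frequency.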
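Instead of using inner objects as you do now, use nested objects. Setting the score mode to "max" will give you results that are more intuitive.

You can see a detailed explanation of why this happens by enabling explain output in your request. Just add the explain=true request parameter to the URL. If you can add the output to your post, I will be happy to help.

@javanna - Thanks for your reply. The output of explain=true exceeds the number of characters stackoverflow allows. Sorry, I can't provide the information.

Maybe you can post the relevant part, or use a third-party service such as pastebin or github gist.

@javanna - I have never used pastebin before. Hopefully you can access my post:

The answer you got is very good. As you can see, your first two documents both have a tf (term frequency) of 2, for exactly the reason explained in the answer. The third document has a higher fieldNorm, which is the factor that reflects it being a perfect match, but since its term frequency is only one, the other documents come out as more relevant. Does that make sense?

I like the way you edited the answer, it makes sense! Even so, seeing the explain output always helps!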
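As a minimal sketch of the two suggestions above (nested objects with score mode "max", and explain=true), the commands below assume the same local node, index and STREET type as in the question, and an ElasticSearch version old enough to still use mapping types. Newer versions also require a `Content-Type: application/json` header and no longer accept a type name. The variable names are illustrative only. The mapping has to be in place before the documents are indexed, so an existing index would need to be recreated.

```shell
# Assumed local node, same address as in the question.
ES='http://127.0.0.1:9200'

# Declare cityNames as nested so that each city name object is indexed
# separately instead of being merged into one multi-valued field.
MAPPING='{
  "mappings" : {
    "STREET" : {
      "properties" : {
        "cityNames" : { "type" : "nested" }
      }
    }
  }
}'

# Nested query: score every city name object on its own and keep the
# best score per street document ("score_mode" : "max").
QUERY='{
  "query" : {
    "nested" : {
      "path" : "cityNames",
      "score_mode" : "max",
      "query" : { "match" : { "cityNames.name" : "bridgewater" } }
    }
  }
}'

# Create the index with the nested mapping.
curl -s --max-time 5 -XPUT "$ES/geoindex_test?pretty=1" -d "$MAPPING" \
  || echo "could not reach $ES"

# (Re-run the three inserts from the question here.)

# Search with explain=true to see how each score is computed.
curl -s --max-time 5 -XGET "$ES/geoindex_test/STREET/_search?pretty=1&explain=true" -d "$QUERY" \
  || echo "could not reach $ES"
```

With this setup each city name is scored by itself, so a field that contains only "bridgewater" is no longer outweighed by a document that merely mentions the term in two longer names.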