Elasticsearch: how do I apply a custom analyzer?
I just found a problem with our Elasticsearch setup: it does not return anything when the field value contains "&". After some googling, I think I need a custom analyzer. I have never used ES before, so I assume I am missing something basic. This is what I have so far, and it is not working as expected:
PUT custom_analyser
{
  "settings": {
    "analysis": {
      "analyzer": {
        "suggest_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [ "lowercase", "my_synonym_filter" ]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "&, and",
            "foo, bar"
          ]
        }
      }
    }
  }
}
I am trying to use it like this:
GET custom_analyser/_search
{
  "aggs": {
    "section": {
      "terms": {
        "field": "section",
        "size": 10,
        "shard_size": 500,
        "include": "jill & jerry" //Not returning anything back for this field using default analyser
      }
    }
  }
}
Output:
{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "section": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
Mapping:
"_doc": {
  "dynamic": "false",
  "date_detection": false,
  "properties": {
    "section": {
      "type": "keyword"
    }
  }
}
Output of GET custom_analyser:
{
  "custom_analyser": {
    "aliases": {},
    "mappings": {},
    "settings": {
      "index": {
        "number_of_shards": "5",
        "provided_name": "custom_analyser",
        "creation_date": "1565971369814",
        "analysis": {
          "filter": {
            "my_synonym_filter": {
              "type": "synonym",
              "synonyms": [
                "&, and",
                "foo, bar"
              ]
            }
          },
          "analyzer": {
            "suggest_analyzer": {
              "filter": [
                "lowercase",
                "my_synonym_filter"
              ],
              "type": "custom",
              "tokenizer": "whitespace"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "oVMOU5wPQ--vKhE3dDFG2Q",
        "version": {
          "created": "6030199"
        }
      }
    }
  }
}
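I think there is a bit of confusion here: an analyzer won't help you, because you are (correctly) using a keyword field for the aggregation, and keyword fields are not analyzed. The only thing you can apply to these fields is a normalizer.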
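To make the normalizer remark concrete, here is a minimal sketch of a keyword field with a lowercase normalizer. The index name test_normalized and the normalizer name lowercase_normalizer are made up for illustration, and the typeless mapping assumes Elasticsearch 7 or later. Normalizers only accept filters that work on a per-character basis (such as lowercase or asciifolding), so the synonym filter from the question cannot be used there.

```json
PUT test_normalized
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "section": {
        "type": "keyword",
        "normalizer": "lowercase_normalizer"
      }
    }
  }
}
```

With this mapping, a value such as "Jill & Jerry" would be indexed as the single term "jill & jerry", which is also the key the terms aggregation would return.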
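For your specific problem: a string passed to include is interpreted as a regular expression, and & is a reserved character in that syntax, so you need to escape the & to make it work as expected.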
Full example
Mapping and sample data:
PUT test
{
  "mappings": {
    "properties": {
      "section": {
        "type": "keyword"
      }
    }
  }
}
PUT test/_doc/1
{
  "section": "jill & jerry"
}
PUT test/_doc/2
{
  "section": "jill jerry"
}
PUT test/_doc/3
{
  "section": "jill"
}
PUT test/_doc/4
{
  "section": "jill & jerry"
}
Query: you need a double backslash for the escape to work here (I am also excluding the actual documents with "size": 0 to keep the response shorter):
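A request matching that description could look like the sketch below. The index name test and the aggregation name section are taken from the surrounding example; the exact body is an assumption. The & is escaped with a backslash for the regular expression, and that backslash is escaped again because it sits inside a JSON string, hence \\&.

```json
GET test/_search
{
  "size": 0,
  "aggs": {
    "section": {
      "terms": {
        "field": "section",
        "include": "jill \\& jerry"
      }
    }
  }
}
```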
Response:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "section" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "jill & jerry",
          "doc_count" : 2
        }
      ]
    }
  }
}
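Can you show your index mapping?
I haven't added any mapping, is that what I'm missing? The mapping for the relevant field is: "section": { "type": "keyword" } @Val
keyword fields are not analyzed. Can you show the document that you think should be included in the aggregation?
@Val - added to the question - thanks
Please show what you get when running GET custom_analyser

Alternatively, you can wrap the value in an array for an exact match.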
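A sketch of the array form, reusing the test index and the section aggregation from the example above (the exact body is an assumption). When include is an array, each entry is matched as an exact term and not as a regular expression, so the & needs no escaping:

```json
GET test/_search
{
  "size": 0,
  "aggs": {
    "section": {
      "terms": {
        "field": "section",
        "include": [ "jill & jerry" ]
      }
    }
  }
}
```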
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "section" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "jill & jerry",
          "doc_count" : 2
        }
      ]
    }
  }
}