Elasticsearch: searching for text containing the "not" keyword
I am using Elasticsearch on AWS (version 7.9) and trying to distinguish between two strings.
My main goal is to split the search results into "found" and "not found".
The general question is how to search for the "not" keyword.
Below are two sample messages:
"CachingServiceOne:Found in cache - Retrieve."
"CachingServiceThree:Not found in cache - Create new."
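You can use an ngram tokenizer to search for "not" in the "title" field.
Adding a working example with index data, mapping, search query, and search result.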
Index mapping:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 5,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "max_ngram_diff": 10
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
Index data:
{
  "title": "CachingServiceThree:Not found in cache - Create new."
}
{
  "title": "CachingServiceOne:Found in cache - Retrieve."
}
Search query:
{
  "query": {
    "match": {
      "title": "Not"
    }
  }
}
Search result:
"hits": [
  {
    "_index": "67093372",
    "_type": "_doc",
    "_id": "2",
    "_score": 0.6720003,
    "_source": {
      "title": "CachingServiceThree:Not found in cache - Create new."
    }
  }
]
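To see why only the "Not found" document comes back, here is a rough Python approximation of what an ngram tokenizer with min_gram 3, max_gram 5 and token_chars letter/digit produces. It is only a sketch, not Elasticsearch's implementation: the function names are made up, document id "1" is assumed (only id "2" appears in the result), and scoring is ignored.

```python
import re

def ngram_tokens(text, min_gram=3, max_gram=5):
    """Approximate an ngram tokenizer limited to letters and digits."""
    tokens = set()
    # Any character that is not a letter or digit acts as a separator.
    for word in re.findall(r"[^\W_]+", text):
        for size in range(min_gram, max_gram + 1):
            for start in range(len(word) - size + 1):
                tokens.add(word[start:start + size])
    return tokens

def matches(query, title):
    """A match query hits when query and field share at least one token."""
    return bool(ngram_tokens(query) & ngram_tokens(title))

# Id "2" is taken from the search result; id "1" is an assumption.
docs = {
    "1": "CachingServiceOne:Found in cache - Retrieve.",
    "2": "CachingServiceThree:Not found in cache - Create new.",
}

hits = [doc_id for doc_id, title in docs.items() if matches("Not", title)]
print(hits)
```

Since the analyzer in the mapping defines no lowercase filter, this sketch is case-sensitive too: "Not" is a token of the second title, while a lowercase "not" would find nothing in either title.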
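Well, it seems the issue really does lie in how the default analyzer works, not in being unable to search for the word not. That is why I accepted the answer. But I would like to add another take, for the sake of simplicity.
The default analyzer does not split words on :, so the search has to be title:CachingServiceThree\:Not, where title is the field name and the : must be escaped as \:.
Searching for title:*\:Not, together with the analogous wildcard query for the other message, fetched everything with this little trick. I wonder whether using an array with all the actual values would be quicker.
Through the Inspect panel, this was translated to: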
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "query_string": {
                  "fields": [
                    "title"
                  ],
                  "query": "*\\:Not"
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ]
    }
  }
}
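If you build this request from code rather than from Kibana, the escaping is the easy part to get wrong. Below is a minimal Python sketch that escapes a term and assembles the same body. The helper names escape_term and suffix_query are hypothetical, and the RESERVED set is my own selection of single-character query_string operators, not an official list.

```python
import json

# Assumed subset of single-character query_string operators.
# "&&" and "||" are two-character operators, and "<" and ">" cannot be
# escaped at all, so none of them are handled here.
RESERVED = set('+-=!(){}[]^"~*?:\\/')

def escape_term(term):
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in term)

def suffix_query(field, suffix):
    """Build a filter that matches field values ending in `suffix`."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "bool": {
                            "should": [
                                {
                                    "query_string": {
                                        "fields": [field],
                                        # Leading * is the wildcard; only the suffix is escaped.
                                        "query": "*" + escape_term(suffix),
                                    }
                                }
                            ],
                            "minimum_should_match": 1,
                        }
                    }
                ]
            }
        }
    }

body = suffix_query("title", ":Not")
print(json.dumps(body, indent=2))
```

The query string itself contains a single backslash (*\:Not). The doubled backslash in the JSON above is just JSON's own string escaping, which json.dumps adds for you.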
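Well, that would definitely work. I thought I was missing something. If I may, I'd like to ask one more question: is writing your own analyzers common practice? It's not as if they add much overhead.
@cr3a7ure If no analyzer is specified, Elasticsearch uses the standard analyzer. So if you have a specific use case, you need to define your own custom analyzer.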