Elasticsearch: searching for the "not" keyword

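I am using Elasticsearch on AWS (version 7.9) and am trying to distinguish between two strings.

My main goal is to split the search results into "found" and "not found".

The general question is how to search for the keyword "not".

Here are two sample messages: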

 "CachingServiceOne:Found in cache - Retrieve."
 "CachingServiceThree:Not found in cache - Create new."
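You can use an ngram tokenizer to search for "not" in the "title" field.

Below is a working example with index data, mapping, search query, and search result.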

Index mapping:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 5,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "max_ngram_diff": 10
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}

Index data:

{
    "title":"CachingServiceThree:Not found in cache - Create new."
}
{
    "title":"CachingServiceOne:Found in cache - Retrieve."
}

Search query:

{
  "query":{
    "match":{
      "title":"Not"
    }
  }
}

Search result:

"hits": [
      {
        "_index": "67093372",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.6720003,
        "_source": {
          "title": "CachingServiceThree:Not found in cache - Create new."
        }
      }
    ]
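
To see why a match query for "Not" returns only the second document, the tokenization can be approximated in plain Python. This is a rough sketch, not Elasticsearch itself: the helper names `ngram_tokens` and `match` are hypothetical, the regular expression only covers ASCII letters and digits, and document id "1" is an assumption (the search result above only confirms id "2").

```python
import re

def ngram_tokens(text, min_gram=3, max_gram=5):
    """Approximate an ngram tokenizer with token_chars = [letter, digit]."""
    tokens = set()
    # Any character that is not a letter or digit (":", " ", "-", ".") splits the text.
    for word in re.findall(r"[A-Za-z0-9]+", text):
        for size in range(min_gram, max_gram + 1):
            for start in range(len(word) - size + 1):
                tokens.add(word[start:start + size])
    return tokens

# Sample documents from the example; id "1" is an assumed id.
docs = {
    "1": "CachingServiceOne:Found in cache - Retrieve.",
    "2": "CachingServiceThree:Not found in cache - Create new.",
}

def match(query, docs):
    """Return ids of documents sharing at least one token with the query."""
    query_tokens = ngram_tokens(query)
    return [doc_id for doc_id, title in docs.items()
            if query_tokens & ngram_tokens(title)]

hits = match("Not", docs)
print(hits)
```

Because ":" is not a letter or digit, "Not" becomes its own word and therefore its own 3-gram. The analyzer in the mapping has no lowercase filter, so this sketch is case-sensitive as well: searching for "not" in lowercase finds nothing here.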

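Well, the problem does indeed seem to lie in how the default analyzer works, not in any inability to search for the word not. That is why I accepted the answer, but I would like to add my own take. For the sake of simplicity:

  • The default analyzer does not split words on ":".

  • That means we have to search for
    title:CachingServiceThree\:Not

  • where
    title
    is the field name and ":" must be escaped as
    \:

  • The trick was to use a wildcard in KQL syntax to find
    title:*\:Not
    and likewise for the other message.

    Using the wildcard with this small trick fetched everything. I wonder whether using an array containing all the actual values would be faster.

    Through the Inspect panel, this translates to: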

    {
      "query": {
        "bool": {
          "filter": [
            {
              "bool": {
                "should": [
                  {
                    "query_string": {
                      "fields": [
                        "title"
                      ],
                      "query": "*\\:Not"
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          ]
        }
      }
    }
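
The points above can be illustrated with a small Python sketch. It is only an approximation under stated assumptions: `standard_like_tokens` is a hypothetical stand-in for the default analyzer (it lowercases and keeps ":" between letters, as described above), and the wildcard is assumed to be compared against the lowercased term. The last part shows why the single backslash typed in KQL appears doubled in the JSON request.

```python
import json
import re
from fnmatch import fnmatchcase

def standard_like_tokens(text):
    """Rough stand-in for the default analyzer: ':' between letters does not split."""
    pattern = r"[A-Za-z0-9]+(?::[A-Za-z0-9]+)*"
    return [token.lower() for token in re.findall(pattern, text)]

title = "CachingServiceThree:Not found in cache - Create new."
tokens = standard_like_tokens(title)
print(tokens)

# Assumption: the wildcard pattern is matched against the lowercased term.
matching = [token for token in tokens if fnmatchcase(token, "*:not")]
print(matching)

# The KQL value *\:Not contains one backslash; JSON serialization doubles it.
kql_value = r"*\:Not"
body = {"query": {"query_string": {"fields": ["title"], "query": kql_value}}}
request_json = json.dumps(body)
print(request_json)
```

Since "Not" never exists as a separate term under this tokenization, only a leading wildcard (or the full escaped term) can reach it, which is what the query above does.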
    

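    Well, that would definitely work. I thought I was missing something. If I may, I'd like to ask one more question: is writing your own analyzer common practice? It's not as if they would add much overhead.

    @cr3a7ure If no analyzer is specified, Elasticsearch uses the standard analyzer. So if you have a specific use case, you need to define your own custom analyzer.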