Elasticsearch "must match" query for logs

I want to use an Elasticsearch query to find the following entries in my logs:

2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"}
2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="abc123"}
2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has successfully been processed, {"uuid"="abc123"}
2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has exited, {"uuid"="abc123"}
2014-07-02 20:53:00 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="def123"}
2014-07-02 20:53:00 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="def123"}
2014-07-02 20:53:00 INFO home.byebyeworld: LOGGER/LOG:ID1234 has successfully been processed, {"uuid"="def123"}
2014-07-02 20:53:00 INFO home.byebyeworld: LOGGER/LOG:ID1234 has exited, {"uuid"="def123"}
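Because each line above is stored as a single "message" in Elasticsearch, I am having trouble querying it with a POST REST call. I tried the "must match" query below to get only line 1 of the log, but the results are inconsistent: sometimes it returns multiple hits instead of just one: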

{
   "query" : {
      "constant_score" : { 
         "filter" : {
            "bool" : {
              "must" : [
                 { "match_phrase_prefix" : {"message" : "home.helloworld:"}}, 
                 { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}},
                 { "match" : {"message" : "received, {\"uuid\"=\"abc123\"}"}} 
              ]
           }
         }
      }
   }
}

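Is something wrong with the Elasticsearch query above? My understanding was that "must" means AND, "match" is more like CONTAINS, and "match_phrase_prefix" is STARTS WITH. Can someone tell me how to correctly query a log like the one above, which contains entries with different uuid values, and get back exactly one hit? At first I thought the query above worked: it returned only 1 hit initially, then 2, then more. That seems inconsistent to me. Thanks in advance.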

The problem is in the third clause of your bool query. Here are two queries that will work for you, and I will explain why they work.

First query

curl -XGET http://localhost:9200/my_logs/_search -d '
{
   "query" : {
      "constant_score" : {
         "filter" : {
            "bool" : {
              "must" : [
                 { "match_phrase_prefix" : {"message" : "home.helloworld:"}},
                 { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}},
                 { "match" : {
                     "message" : {
                        "query": "received, {\"uuid\"=\"abc123\"", 
                        "operator": "and"
                     }
                   }
                 }
              ]
           }
         }
      }
   }
}'
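Explanation

Let's make sure we are on the same page about indexing. By default, the indexer passes your data through the standard analysis chain: it splits the text on whitespace, strips special characters, lowercases, and so on. The index therefore contains only tokens together with their positions.

Because match is a full-text query, it takes your query text "received, {\"uuid\"=\"abc123\"" and runs it through analysis as well. By default this analysis likewise splits the text on whitespace, strips special characters, lowercases, and so on. The result is similar to this (simplified):
received
uuid
abc123

The match query then combines these tokens against the message field using the default operator (which is or). As a logical expression, the last clause (the match query) therefore looks like this:
message:received OR message:uuid OR message:abc123

That is why the first 4 log entries match. I was able to reproduce it. Setting "operator": "and", as in the query above, requires all three tokens to be present instead.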
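
To make the OR versus AND behaviour concrete, here is a rough Python sketch that evaluates only the third clause against the four abc123 entries from the question. It does not talk to Elasticsearch. The analyze function is a crude stand-in I made up for illustration (lowercase, split on anything that is not a letter or digit), so the real standard analyzer may tokenize some strings differently, and clause_matches is likewise a hypothetical helper, not an Elasticsearch API.

```python
import re

# The four abc123 entries from the question, in order.
LOGS = [
    '2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"}',
    '2014-07-02 20:52:39 INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="abc123"}',
    '2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has successfully been processed, {"uuid"="abc123"}',
    '2014-07-02 20:52:39 INFO home.byebyeworld: LOGGER/LOG:ID1234 has exited, {"uuid"="abc123"}',
]

QUERY = 'received, {"uuid"="abc123"'


def analyze(text):
    """Simplified analyzer (assumption): lowercase, keep runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def clause_matches(message, query, operator="or"):
    """Mimic a match clause: any query token (or) or every query token (and)."""
    doc_tokens = set(analyze(message))
    query_tokens = analyze(query)
    combine = any if operator == "or" else all
    return combine(token in doc_tokens for token in query_tokens)


# 1-based entry numbers that satisfy the clause under each operator.
or_hits = [n for n, line in enumerate(LOGS, 1) if clause_matches(line, QUERY, "or")]
and_hits = [n for n, line in enumerate(LOGS, 1) if clause_matches(line, QUERY, "and")]

print("query tokens:", analyze(QUERY))
print("operator=or :", or_hits)
print("operator=and:", and_hits)
```

With or, a single shared token such as uuid is enough for an entry to satisfy the clause, which is why it is so permissive. With and, an entry has to contain received, uuid and abc123 at the same time.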

Second query

curl -XGET http://localhost:9200/my_logs/_search -d '
{
   "query" : {
      "constant_score" : {
         "filter" : {
            "bool" : {
              "must" : [
                 { "match_phrase_prefix" : {"message" : "home.helloworld:"}},
                 { "match_phrase_prefix" : {"message" : "LOGGER/LOG:ID1234"}},
                 { "match_phrase_prefix" : {"message" : "received, {\"uuid\"=\"abc123\""}}
              ]
           }
         }
      }
   }
}'
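Explanation

Remember: our indexing process left tokens and their positions in the index.

What the match_phrase_prefix query actually does: it takes the input query (for example "received, {\"uuid\"=\"abc123\""), applies exactly the same text analysis steps as the match query, and then tries to find the tokens received, uuid, abc123 at adjacent positions in the index, in exactly the same order: received -> uuid -> abc123 (almost).

The exception is the last token, abc123 in our case. To be precise, the query treats the last token as a prefix, i.e. it adds a wildcard to it: received -> uuid -> abc123*

As a perfectionist, I want to add that received -> uuid -> abc123 (i.e. without the wildcard at the end) is what the match_phrase query actually does. It also takes positions in the index into account, i.e. it tries to match a "phrase", not just separate tokens at arbitrary positions.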
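
The positional matching described above can be sketched in plain Python. This is only an illustration under stated assumptions: analyze is a simplified tokenizer I made up (not the real standard analyzer), phrase_matches is a hypothetical helper, and the real query works on the inverted index rather than scanning each message.

```python
import re


def analyze(text):
    """Simplified analyzer (assumption): lowercase, keep runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_matches(message, query, prefix_last=False):
    """True if the query tokens occur at adjacent positions, in order.

    prefix_last=False mimics a phrase match (every token must be equal).
    prefix_last=True mimics a phrase-prefix match (last token is a prefix).
    """
    doc = analyze(message)
    wanted = analyze(query)
    if not wanted:
        return False
    for start in range(len(doc) - len(wanted) + 1):
        window = doc[start:start + len(wanted)]
        head_ok = window[:-1] == wanted[:-1]
        if prefix_last:
            last_ok = window[-1].startswith(wanted[-1])
        else:
            last_ok = window[-1] == wanted[-1]
        if head_ok and last_ok:
            return True
    return False


received = 'INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been received, {"uuid"="abc123"}'
transferred = 'INFO home.helloworld: LOGGER/LOG:ID1234 has successfully been transferred, {"uuid"="abc123"}'

# Full phrase: only the "received" entry has received -> uuid -> abc123 in a row.
print(phrase_matches(received, 'received, {"uuid"="abc123"', prefix_last=True))
print(phrase_matches(transferred, 'received, {"uuid"="abc123"', prefix_last=True))

# A truncated last token matches only when it is treated as a prefix.
print(phrase_matches(received, 'received, {"uuid"="abc', prefix_last=True))
print(phrase_matches(received, 'received, {"uuid"="abc', prefix_last=False))
```

Because the tokens must be adjacent and in order, the entry ending in transferred does not match even though it contains both uuid and abc123.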