Logstash: indexing JSON arrays

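Logstash is awesome. I can send it JSON with nested objects (for example, an object b that contains "alpha": "awesome")

and then query for that line in Kibana using the search term

b.alpha:awesome

Nice.

However, I now have a JSON log line like this: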

{
  "different":[
    {
      "this": "one",
      "that": "uno"
    },
    {
      "this": "two"
    }
  ]
}

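I'd like to be able to find this line with a search like different.this:two (or different.this:one, or different.that:uno).

If I were using Lucene directly, I'd iterate through the different array and generate a new search index for each hash within it, but Logstash currently seems to ingest that line like this: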

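different: {this: one, that: uno}, {this: two}

This doesn't help me search for log lines using different.this or different.that.
Any thoughts on a codec, filter, or code change I could make to enable this?

You can write your own filter (copy and paste the existing one, rename the class, change config_name, and rewrite the filter(event) method) or modify the current JSON filter (its source is on Github).
You can find the JSON filter (a Ruby class) source code under logstash-1.x.x\lib\logstash\filters, in a file named json.rb. The JSON filter parses the content as JSON as follows: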
begin
  # TODO(sissel): Note, this will not successfully handle json lists
  # like your text is '[ 1,2,3 ]' JSON.parse gives you an array (correctly)
  # which won't merge into a hash. If someone needs this, we can fix it
  # later.
  dest.merge!(JSON.parse(source))

  # If no target, we target the root of the event object. This can allow
  # you to overwrite @timestamp. If so, let's parse it as a timestamp!
  if !@target && event[TIMESTAMP].is_a?(String)
    # This is a hack to help folks who are mucking with @timestamp during
    # their json filter. You aren't supposed to do anything with
    # "@timestamp" outside of the date filter, but nobody listens... ;)
    event[TIMESTAMP] = Time.parse(event[TIMESTAMP]).utc
  end

  filter_matched(event)
rescue => e
  event.tag("_jsonparsefailure")
  @logger.warn("Trouble parsing json", :source => @source,
               :raw => event[@source], :exception => e)
  return
end

You can modify the parsing step to change the parsed JSON before it is merged, for example:
  json  = JSON.parse(source)
  if json.is_a?(Hash)
    json.each do |key, value| 
        if value.is_a?(Array)
            value.each_with_index do |object, index|
                #modify as you need
                object["index"]=index
            end
        end
    end
  end
  #save modified json
  ......
  dest.merge!(json)
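
The modification above can be tried outside Logstash. The following is a minimal standalone sketch, not the filter's actual code: add_array_indexes is a hypothetical helper name, and the input is the log line from the question. Like the snippet above, it only covers arrays that sit directly under the top-level object.

```ruby
require 'json'

# Hypothetical helper: tag every hash inside a top-level array with its position.
def add_array_indexes(source)
  json = JSON.parse(source)
  return json unless json.is_a?(Hash)

  json.each_value do |value|
    next unless value.is_a?(Array)

    value.each_with_index do |object, index|
      # Only hashes can take an extra key; scalar elements are left untouched.
      object['index'] = index if object.is_a?(Hash)
    end
  end
  json
end

line = '{"different":[{"this":"one","that":"uno"},{"this":"two"}]}'
result = add_array_indexes(line)
puts JSON.generate(result)
# prints {"different":[{"this":"one","that":"uno","index":0},{"this":"two","index":1}]}
```

The is_a?(Hash) guard is an addition for safety: without it, an array of plain strings or numbers would raise when the code tries to assign a key on a scalar.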

Then you can modify your config file to use your new or modified JSON filter, and place it in \logstash-1.x.x\lib\logstash\config

This is my elastic_... configuration, with a modified json.rb filter:
input {
  stdin {}
}

filter {
  json {
    source => "message"
  }
}

output {
  elasticsearch {
    host => "localhost"
  }
  stdout {}
}

If you want to use your new filter, you can refer to it in the configuration by its config_name:
class LogStash::Filters::Json_index < LogStash::Filters::Base

  config_name "json_index"
  milestone 2
  ....
end

Hope this helps.

For a quick and dirty hack, I used the Ruby filter with the code below; there is no need to use the out-of-the-box "json" filter anymore.
input {
  stdin{}
}

filter {
  grok {
    match => ["message","(?<json_raw>.*)"]
  }
  ruby {
    init => "
      def parse_json obj, pname=nil, event
         obj = JSON.parse(obj) unless obj.is_a? Hash
         obj = obj.to_hash unless obj.is_a? Hash

         obj.each {|k,v|
         p = pname.nil?? k : pname
         if v.is_a? Array
           v.each_with_index {|oo,ii|               
             parse_json_array(oo,ii,p,event)
           }
           elsif v.is_a? Hash
             parse_json(v,p,event)
           else
             p = pname.nil?? k : [pname,k].join('.')
             event[p] = v
           end
         }
        end

        def parse_json_array obj, i,pname, event
          obj = JSON.parse(obj) unless obj.is_a? Hash
          pname_ = pname
          if obj.is_a? Hash
            obj.each {|k,v|
              p=[pname_,i,k].join('.')
              if v.is_a? Array
                v.each_with_index {|oo,ii|
                  parse_json_array(oo,ii,p,event)
                }
              elsif v.is_a? Hash
                parse_json(v,p, event)
              else
                event[p] = v
              end
            }
          else
            n = [pname_, i].join('.')
            event[n] = obj
          end
        end
      "
      code => "parse_json(event['json_raw'].to_s,nil,event) if event['json_raw'].to_s.include? ':'"
    }


  }

output {
  stdout{codec => rubydebug}
}

This is the output from it:
      {
           "message" => "{\"id\":123, \"members\":[{\"i\":1, \"arr\":[{\"ii\":11},{\"ii\":22}]},{\"i\":2}], \"im_json\":{\"id\":234, \"members\":[{\"i\":3},{\"i\":4}]}}",
          "@version" => "1",
        "@timestamp" => "2014-07-25T00:06:00.814Z",
              "host" => "Leis-MacBook-Pro.local",
          "json_raw" => "{\"id\":123, \"members\":[{\"i\":1, \"arr\":[{\"ii\":11},{\"ii\":22}]},{\"i\":2}], \"im_json\":{\"id\":234, \"members\":[{\"i\":3},{\"i\":4}]}}",
                "id" => 123,
       "members.0.i" => 1,
"members.0.arr.0.ii" => 11,
"members.0.arr.1.ii" => 22,
       "members.1.i" => 2,
           "im_json" => 234,
       "im_json.0.i" => 3,
       "im_json.1.i" => 4
      }
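
In the output above, the scalar inside the nested im_json hash ends up under the key im_json rather than im_json.id, and its members entries lose the members segment. As a point of comparison, here is a minimal standalone sketch (plain Ruby, outside Logstash) of a flattener that keeps the full path. flatten_json and its sep parameter are hypothetical names introduced for illustration; they are not part of Logstash.

```ruby
require 'json'

# Hypothetical helper: flatten nested hashes and arrays into a single level,
# joining hash keys and array indexes with +sep+.
def flatten_json(value, prefix = nil, sep = '.', out = {})
  case value
  when Hash
    value.each { |k, v| flatten_json(v, [prefix, k].compact.join(sep), sep, out) }
  when Array
    value.each_with_index { |v, i| flatten_json(v, [prefix, i].compact.join(sep), sep, out) }
  else
    # Leaf value: store it under the path accumulated so far.
    out[prefix] = value
  end
  out
end

raw = '{"id":123, "members":[{"i":1, "arr":[{"ii":11},{"ii":22}]},{"i":2}], ' \
      '"im_json":{"id":234, "members":[{"i":3},{"i":4}]}}'
flat = flatten_json(JSON.parse(raw))
flat.each { |k, v| puts "#{k} => #{v}" }
# prints:
# id => 123
# members.0.i => 1
# members.0.arr.0.ii => 11
# members.0.arr.1.ii => 22
# members.1.i => 2
# im_json.id => 234
# im_json.members.0.i => 3
# im_json.members.1.i => 4
```

Passing another separator, for example flatten_json(data, nil, '_'), gives keys such as members_0_i, which may help where periods in field names are a problem.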

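I liked the Ruby filter solution because it doesn't require us to write yet another filter. However, that solution creates its fields at the "root" of the JSON, which makes it hard to keep track of what the original document looked like.
I came up with something similar that is easier to follow, and since it is a recursive solution it is also cleaner: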
ruby {
    init => "
        def arrays_to_hash(h)
          h.each do |k,v|
            # If v is nil, an array is being iterated and the value is k.
            # If v is not nil, a hash is being iterated and the value is v.
            value = v || k
            if value.is_a?(Array)
                # 'value' is replaced with 'value_hash' later.
                value_hash = {}
                value.each_with_index do |v, i|
                    value_hash[i.to_s] = v
                end
                h[k] = value_hash
            end

            if value.is_a?(Hash) || value.is_a?(Array)
              arrays_to_hash(value)
            end
          end
        end
      "
      code => "arrays_to_hash(event.to_hash)"
}

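It converts arrays to hashes, with each key being the index number.

What JSON format do you expect after the arrays are indexed?

Over time this solution should still work, but in a hacky way, I would say. A predictable JSON structure is best handled with a predefined mapping; for unpredictable JSON with arrays inside, you can still do something like this, but in your own custom filter.

In recent Elasticsearch versions, field names cannot contain periods. If you are drawn to this solution, use a different character.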
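
To see what the array-to-hash conversion does to the log line from the question, here is a minimal standalone sketch in plain Ruby. It is a hypothetical non-mutating variant of arrays_to_hash (it returns a new structure rather than editing the event in place), so treat it as an illustration of the idea and not as the filter code above.

```ruby
require 'json'

# Hypothetical pure variant: builds a new structure in which every array
# becomes a hash keyed by the element's index (as a string).
def arrays_to_hash(value)
  case value
  when Array
    value.each_with_index.map { |v, i| [i.to_s, arrays_to_hash(v)] }.to_h
  when Hash
    value.map { |k, v| [k, arrays_to_hash(v)] }.to_h
  else
    value
  end
end

line = '{"different":[{"this":"one","that":"uno"},{"this":"two"}]}'
converted = arrays_to_hash(JSON.parse(line))
puts JSON.generate(converted)
# prints {"different":{"0":{"this":"one","that":"uno"},"1":{"this":"two"}}}
```

With this shape the index becomes part of the field path, so a search would presumably need to name it (different.1.this:two rather than different.this:two).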