Logstash indexing JSON arrays
Logstash is awesome. I can send it JSON (multi-lined for readability) and then query for that line in Kibana using a search term such as
b.alpha:awesome
Nice.
However, I now have a JSON log line like this:
{
  "different": [
    {
      "this": "one",
      "that": "uno"
    },
    {
      "this": "two"
    }
  ]
}
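I'd like to be able to find this line with a search like different.this:two (or different.this:one, or different.that:uno).
If I were using Lucene directly, I'd iterate through the different array and generate a new search index entry for each hash within it, but Logstash currently seems to ingest that line like this: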
different: {this: one, that: uno}, {this: two}
That won't help me search for log lines using different.this or different.that.
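Any thoughts on a codec, filter, or code change I can make to enable this?

You can write your own filter (copy and paste the existing one, rename the class and the config_name, and rewrite the filter(event) method) or modify the current filter (its source is on GitHub).
You can find the JSON filter (a Ruby class) source code under logstash-1.x.x\lib\logstash\filters, in the file named json.rb. The JSON filter parses the content as JSON as follows: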
begin
  # TODO(sissel): Note, this will not successfully handle json lists
  # like your text is '[ 1,2,3 ]' JSON.parse gives you an array (correctly)
  # which won't merge into a hash. If someone needs this, we can fix it
  # later.
  dest.merge!(JSON.parse(source))
  # If no target, we target the root of the event object. This can allow
  # you to overwrite @timestamp. If so, let's parse it as a timestamp!
  if !@target && event[TIMESTAMP].is_a?(String)
    # This is a hack to help folks who are mucking with @timestamp during
    # their json filter. You aren't supposed to do anything with
    # "@timestamp" outside of the date filter, but nobody listens... ;)
    event[TIMESTAMP] = Time.parse(event[TIMESTAMP]).utc
  end
  filter_matched(event)
rescue => e
  event.tag("_jsonparsefailure")
  @logger.warn("Trouble parsing json", :source => @source,
               :raw => event[@source], :exception => e)
  return
end
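The TODO comment in that excerpt can be checked outside Logstash. The following is a minimal sketch in plain Ruby, not Logstash code: `try_merge` is a hypothetical helper and the sample strings are made up for illustration. It shows that a parsed JSON object merges into a hash, while a parsed JSON list raises `TypeError`, which in the filter above would end up in the `rescue` branch.

```ruby
require 'json'

# Hypothetical helper: merge a parsed JSON document into dest and
# report whether the merge was possible.
def try_merge(dest, source)
  dest.merge!(JSON.parse(source))
  :merged
rescue TypeError
  # JSON.parse returned an Array, which Hash#merge! cannot accept.
  :not_a_hash
end

dest = {}
object_result = try_merge(dest, '{"b": {"alpha": "awesome"}}')
list_result = try_merge(dest, '[1, 2, 3]')

puts object_result
puts list_result
```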
You can change the parsing step so that it modifies the original JSON:
json = JSON.parse(source)
if json.is_a?(Hash)
  json.each do |key, value|
    if value.is_a?(Array)
      value.each_with_index do |object, index|
        # modify as you need
        object["index"] = index
      end
    end
  end
end
# save modified json
......
dest.merge!(json)
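To try that modification without a Logstash install, here is a small standalone sketch. `add_array_indexes` is a hypothetical name, and the guard that skips non-hash array elements is an addition not present in the snippet above (assigning `object["index"]` on a number or string would fail).

```ruby
require 'json'

# Hypothetical helper: tag every hash inside a top-level array with its
# position in that array, mirroring the modification shown above.
def add_array_indexes(json)
  return json unless json.is_a?(Hash)

  json.each do |_key, value|
    next unless value.is_a?(Array)

    value.each_with_index do |object, index|
      # Assumption: only hashes are tagged; scalars are left untouched.
      object['index'] = index if object.is_a?(Hash)
    end
  end
  json
end

source = '{"different":[{"this":"one","that":"uno"},{"this":"two"}]}'
json = add_array_indexes(JSON.parse(source))
puts JSON.generate(json)
```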
Then you can modify your config file to use the new/modified JSON filter and place it in \logstash-1.x.x\lib\logstash\config
Here is my config file, used with a modified json.rb filter:
input {
  stdin {
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    host => localhost
  }
  stdout {
  }
}
If you want to use your new filter, you can configure it with the config_name:
class LogStash::Filters::Json_index < LogStash::Filters::Base
  config_name "json_index"
  milestone 2
  ....
end
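Hope this helps.

For a quick and dirty hack, I used the ruby filter with the code below, so there is no need to use the out-of-the-box "json" filter anymore: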
input {
  stdin{}
}
filter {
  grok {
    match => ["message","(?<json_raw>.*)"]
  }
  ruby {
    init => "
      def parse_json obj, pname=nil, event
        obj = JSON.parse(obj) unless obj.is_a? Hash
        obj = obj.to_hash unless obj.is_a? Hash
        obj.each {|k,v|
          p = pname.nil?? k : pname
          if v.is_a? Array
            v.each_with_index {|oo,ii|
              parse_json_array(oo,ii,p,event)
            }
          elsif v.is_a? Hash
            parse_json(v,p,event)
          else
            p = pname.nil?? k : [pname,k].join('.')
            event[p] = v
          end
        }
      end
      def parse_json_array obj, i,pname, event
        obj = JSON.parse(obj) unless obj.is_a? Hash
        pname_ = pname
        if obj.is_a? Hash
          obj.each {|k,v|
            p=[pname_,i,k].join('.')
            if v.is_a? Array
              v.each_with_index {|oo,ii|
                parse_json_array(oo,ii,p,event)
              }
            elsif v.is_a? Hash
              parse_json(v,p, event)
            else
              event[p] = v
            end
          }
        else
          n = [pname_, i].join('.')
          event[n] = obj
        end
      end
    "
    code => "parse_json(event['json_raw'].to_s,nil,event) if event['json_raw'].to_s.include? ':'"
  }
}
output {
  stdout{codec => rubydebug}
}
Here is its output:
{
"message" => "{\"id\":123, \"members\":[{\"i\":1, \"arr\":[{\"ii\":11},{\"ii\":22}]},{\"i\":2}], \"im_json\":{\"id\":234, \"members\":[{\"i\":3},{\"i\":4}]}}",
"@version" => "1",
"@timestamp" => "2014-07-25T00:06:00.814Z",
"host" => "Leis-MacBook-Pro.local",
"json_raw" => "{\"id\":123, \"members\":[{\"i\":1, \"arr\":[{\"ii\":11},{\"ii\":22}]},{\"i\":2}], \"im_json\":{\"id\":234, \"members\":[{\"i\":3},{\"i\":4}]}}",
"id" => 123,
"members.0.i" => 1,
"members.0.arr.0.ii" => 11,
"members.0.arr.1.ii" => 22,
"members.1.i" => 2,
"im_json" => 234,
"im_json.0.i" => 3,
"im_json.1.i" => 4
}
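In the output shown above, the keys under im_json lose part of their path (for example im_json.0.i rather than im_json.members.0.i). The following is a minimal sketch in plain Ruby, outside Logstash, of a flattening routine that keeps the full path. `flatten_json` is a hypothetical name, and the `sep` keyword is likewise an assumption, added so that a character other than a period can be used as the separator. It is one possible approach, not the code that produced the output above.

```ruby
require 'json'

# Hypothetical helper: flatten nested hashes and arrays into a single
# hash whose keys are the full path to each leaf value.
def flatten_json(obj, prefix = nil, out = {}, sep: '.')
  case obj
  when Hash
    obj.each do |k, v|
      flatten_json(v, [prefix, k].compact.join(sep), out, sep: sep)
    end
  when Array
    obj.each_with_index do |v, i|
      flatten_json(v, [prefix, i].compact.join(sep), out, sep: sep)
    end
  else
    # Leaf value: store it under the accumulated path.
    out[prefix] = obj
  end
  out
end

raw = '{"id":123, "members":[{"i":1, "arr":[{"ii":11},{"ii":22}]},{"i":2}], ' \
      '"im_json":{"id":234, "members":[{"i":3},{"i":4}]}}'
flat = flatten_json(JSON.parse(raw))
flat_underscore = flatten_json(JSON.parse(raw), sep: '_')

flat.each { |key, value| puts "#{key} => #{value}" }
```

Inside a ruby filter, each pair of the resulting hash could then be assigned to the event in the same way the code above does with event[p] = v.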
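The solution I like is the ruby filter, because it does not require us to write yet another filter. However, that solution creates all its fields at the "root" of the JSON, which makes it hard to keep track of what the original document looked like.
I came up with a similar approach that is easier to follow; it is recursive, and therefore cleaner: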
ruby {
  init => "
    def arrays_to_hash(h)
      h.each do |k,v|
        # If v is nil, an array is being iterated and the value is k.
        # If v is not nil, a hash is being iterated and the value is v.
        value = v || k
        if value.is_a?(Array)
          # 'value' is replaced with 'value_hash' later.
          # (Single quotes here: a double quote would end the init string.)
          value_hash = {}
          value.each_with_index do |item, i|
            value_hash[i.to_s] = item
          end
          h[k] = value_hash
        end
        if value.is_a?(Hash) || value.is_a?(Array)
          arrays_to_hash(value)
        end
      end
    end
  "
  code => "arrays_to_hash(event.to_hash)"
}
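It converts arrays to hashes, with each key being the array index number.

What JSON format would you expect after the array has been indexed?

This solution should still work over time, but in a hackish way, I'd say. A predictable JSON structure is best served by a predefined mapping; for unpredictable JSON with arrays inside, you can still do something like this, but in your own custom filter instead.

In recent Elasticsearch versions, field names cannot contain periods. If you are drawn to this solution, use a different character.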
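The array-to-hash conversion can also be tried outside Logstash. Below is a minimal sketch in plain Ruby that builds a new structure instead of modifying the input in place; `arrays_to_hashes` is a hypothetical name, and this is a variation on the idea rather than the filter code itself. Because the result is nested rather than flattened, the keys themselves contain no periods.

```ruby
require 'json'

# Hypothetical helper: return a copy of the structure in which every
# array is replaced by a hash keyed by the element's index (as a string).
def arrays_to_hashes(value)
  case value
  when Array
    result = {}
    value.each_with_index { |item, i| result[i.to_s] = arrays_to_hashes(item) }
    result
  when Hash
    result = {}
    value.each { |k, v| result[k] = arrays_to_hashes(v) }
    result
  else
    value
  end
end

doc = JSON.parse('{"different":[{"this":"one","that":"uno"},{"this":"two"}]}')
converted = arrays_to_hashes(doc)
puts JSON.generate(converted)
```

With the log line from the question, the element that was second in the array ends up under the key "1" inside different, so its this value has its own distinct path.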