Logstash XML parsing merges the contents of all fields under one field?
I am using Logstash 2.4.1. I want to parse an XML file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
<cd country="USA">
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<price>10.90</price>
</cd>
<cd country="UK">
<title>Hide your heart</title>
<artist>Bonnie Tyler</artist>
<price>10.0</price>
</cd>
<cd country="USA">
<title>Greatest Hits</title>
<artist>Dolly Parton</artist>
<price>9.90</price>
</cd>
</catalog>
But what I get is this:
"country" => [
    [0] "USA",
    [1] "UK",
    [2] "USA"
],
"title" => [
    [0] "Empire Burlesque",
    [1] "Hide your heart",
    [2] "Greatest Hits"
],
"artist" => [
    [0] "Bob Dylan",
    [1] "Bonnie Tyler",
    [2] "Dolly Parton"
],
"price" => [
    [0] "10.90",
    [1] "10.0",
    [2] "9.90"
]
My Logstash configuration looks like this:
input {
  file {
    path => "F:\logstash-2.4.0\logstash-2.4.0\bin\samplexml.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      pattern => "^<\?cd.*\>"
      negate => true
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    xpath =>
    [
      "/catalog/cd/@country", "country",
      "/catalog/cd/title/text()", "title",
      "/catalog/cd/artist/text()", "artist",
      "/catalog/cd/price/text()", "price"
    ]
    store_xml => false
    target => "doc"
  }
}
output {
  stdout { codec => rubydebug }
}
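How can I get the desired output from the above XML file?
Thanks.

This is not what you asked for, but if you can consider an alternative to Logstash, create a foopipes.yml in the current directory, for example: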
pipelines:
  -
    when:
      - queue: started
    from:
      - readfile: ./input.xml
    do:
      - parsexml
      - select: $.catalog.cd[*]     # This is a JSON path expression
      - map: ~
        country: "#{@country}"      # These are data binding expressions
        title: "#{title}"
        artist: "#{artist}"
        price: "#{price}"
    to:
      - log
    finally:
      - exit
Start it with:
docker run -v %CD%:/project aretera/foopipes
(%CD% is replaced with the absolute path of the current directory on Windows.)
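
For comparison, the difference between the merged output in the question and the select-then-map approach above can be sketched outside either tool. This is a minimal Python illustration using only the standard library's xml.etree.ElementTree. The function names merged_fields and records_per_cd and the inlined sample are assumptions made for the sketch, not part of Logstash or Foopipes. Evaluating each path against the whole document collects every match into one list per field, while selecting each cd element first and reading its children yields one record per CD.

```python
import xml.etree.ElementTree as ET

# Inlined copy of the sample catalog (XML declaration omitted for brevity).
SAMPLE_XML = """<catalog>
<cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
<cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>10.0</price></cd>
<cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""


def merged_fields(xml_text):
    """Evaluate each path over the whole document: one list per field."""
    root = ET.fromstring(xml_text)
    return {
        "country": [cd.get("country") for cd in root.findall("cd")],
        "title": [node.text for node in root.findall("cd/title")],
        "artist": [node.text for node in root.findall("cd/artist")],
        "price": [node.text for node in root.findall("cd/price")],
    }


def records_per_cd(xml_text):
    """Select each <cd> first, then map its children: one dict per CD."""
    root = ET.fromstring(xml_text)
    return [
        {
            "country": cd.get("country"),
            "title": cd.findtext("title"),
            "artist": cd.findtext("artist"),
            "price": cd.findtext("price"),
        }
        for cd in root.findall("cd")
    ]


if __name__ == "__main__":
    for record in records_per_cd(SAMPLE_XML):
        print(record)
```

merged_fields reproduces the shape of the output shown in the question, and records_per_cd corresponds to the select and map steps in the pipeline above.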