Extracting XML data with Logstash and XPath
I have the following sample XML data:
<root>
<actors>
<actor id="1" name="Christian Bale"></actor>
<actor id="2" name="Liam Neeson"></actor>
<actor id="3" name="Michael Caine"></actor>
</actors>
</root>
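What I need, however, is an index named actor whose fields are created from each actor element, namely id and name.
This is the log output when I run my configuration: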
Using bundled JDK: ""
OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.jruby.ext.openssl.SecurityHelper (file:/C:/Users/CHEEWE~1.NGA/AppData/Local/Temp/jruby-11656/jruby5503754749915308062jopenssl.jar) to field java.security.MessageDigest.provider
WARNING: Please consider reporting this to the maintainers of org.jruby.ext.openssl.SecurityHelper
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Sending Logstash logs to D:/Logstash/logs which is now configured via log4j2.properties
[2020-12-07T17:54:43,527][INFO ][logstash.runner] Starting Logstash {"logstash.version"=>"7.10.0", "jruby.version"=>"jruby 9.2.13.0 (2.5.7) 2020-08-03 9a89c94bcc OpenJDK 64-Bit Server VM 11.0.8+10 on 11.0.8+10 +indy +jit [mswin32-x86_64]"}
[2020-12-07T17:54:43,843][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2020-12-07T17:54:45,899][INFO ][org.reflections.Reflections] Reflections took 43 ms to scan 1 urls, producing 23 keys and 47 values
[2020-12-07T17:54:47,229][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2020-12-07T17:54:47,482][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2020-12-07T17:54:47,544][INFO ][logstash.outputs.elasticsearch][main] ES Output version determined {:es_version=>7}
[2020-12-07T17:54:47,551][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-12-07T17:54:47,618][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2020-12-07T17:54:47,689][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>7, :ecs_compatibility=>:disabled}
[2020-12-07T17:54:47,786][INFO ][logstash.outputs.elasticsearch][main] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1, "index.lifecycle.name"=>"logstash-policy", "index.lifecycle.rollover_alias"=>"logstash"}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-12-07T17:54:47,846][INFO ][logstash.outputs.elasticsearch][main] Creating rollover alias
[2020-12-07T17:54:47,964][INFO ][logstash.javapipeline][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["D:/logstash/bin/logstash simple.conf"], :thread=>"#"}
[2020-12-07T17:54:49,256][INFO ][logstash.javapipeline][main] Pipeline Java execution initialization time {"seconds"=>1.29}
[2020-12-07T17:54:49,347][INFO ][logstash.javapipeline][main] Pipeline started {"pipeline.id"=>"main"}
The stdin plugin is now waiting for input:
[2020-12-07T17:54:49,446][INFO ][logstash.agent] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2020-12-07T17:54:49,757][INFO ][logstash.agent] Successfully started Logstash API endpoint {:port=>9600}
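If both Elasticsearch and Logstash are running recent versions, ILM (index lifecycle management) is enabled by default. In that case the value of the index option is ignored and the default index name is logstash-{now/d}-000001. If you want to set the index name with the index option, set the ilm_enabled option to false.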
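A minimal sketch of the output section with ILM turned off, so that the index option takes effect. The host and index name are taken from the question; nothing else is changed.

```conf
output {
  elasticsearch {
    hosts => ["http://localhost:9200/"]
    # Turn off index lifecycle management so the index option below is honoured.
    ilm_enabled => false
    index => "actor"
  }
}
```

With this setting, events should be written to the actor index rather than to the ILM rollover alias.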
input {
  file {
    path => "D:/data.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    exclude => "*.gz"
    type => "xml"
    codec => multiline {
      pattern => "<?xml "
      negate => "true"
      what => "previous"
    }
  }
}
filter {
  xml {
    source => "message"
    store_xml => true
    target => "id"
    target => "root"
    xpath => [
      "/root/actors/actor/text()", "actor"
    ]
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200/"]
    index => "actor"
  }
  stdout {
    codec => rubydebug
  }
}
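Separately from the index name, the filter in the configuration above is unlikely to yield id and name fields: the actor elements carry their data in attributes and have no text content, so /root/actors/actor/text() selects nothing, and the sample file has no <?xml declaration for the multiline pattern to anchor on. The following is an untested sketch of one possible approach, assuming the file contains only the <root> document shown in the question. The field name doc and the auto_flush_interval value are illustrative choices, not taken from the original post.

```conf
input {
  file {
    path => "D:/data.xml"
    start_position => "beginning"
    sincedb_path => "NUL"
    codec => multiline {
      # Assumption: every document starts with <root>; lines that do not
      # match the pattern are appended to the previous event.
      pattern => "<root>"
      negate => true
      what => "previous"
      # Flush the buffered event after 2 seconds without new lines,
      # otherwise the last document would never be emitted.
      auto_flush_interval => 2
    }
  }
}

filter {
  xml {
    source => "message"
    store_xml => true
    target => "doc"
    # Do not wrap single child elements in arrays, so the list of actors
    # ends up under [doc][actors][actor].
    force_array => false
  }
  # Emit one event per <actor> element.
  split {
    field => "[doc][actors][actor]"
  }
  # Copy the attributes to top-level fields and drop the intermediate data.
  mutate {
    rename => {
      "[doc][actors][actor][id]" => "id"
      "[doc][actors][actor][name]" => "name"
    }
    remove_field => ["doc", "message"]
  }
}
```

Under these assumptions each resulting event would carry one actor's id and name as string fields, which can then be sent to the actor index by the existing output section.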