Elasticsearch Logstash pulling data from different types of messages
Below are 3 examples of the types of logs I get from my automation platform. I am looking to extract the customOptions section. The challenge I am running into is that there can be many customOptions. I think what I need to do is split the customOptions array and then parse it. I have tried logstash dissect, grok and mutate, and struggled to get the data out.
Tags: elasticsearch, logstash, logstash-grok, logstash-configuration
2020-12-09_18:06:30.58027 executing local task [refId:3122, lockTimeout:330000, lockTtl:300000, jobType:jobTemplateExecute, lockId:job.execute.3122, jobTemplateId:3122, jobDate:1607537190133, userId:1897, customConfig:{"AnsibleRequestedUser":"testing1","AnsibleRequestedUserPassword":"VMware321!"}, jobTemplateExecutionId:5677, customInputs:[customOptions:[AnsibleRequestedUser:testing1, AnsibleRequestedUserPassword:VMware321!]], processConfig:[accountId:947, status:executing, username:user1, userId:1897, userDisplayName:user1 user1, refType:jobTemplate, refId:3122, timerCategory:TEST: 0. Enterprise Create User, timerSubCategory:3122, description: Enterprise Create User], processMap:[success:true, refType:jobTemplate, refId:3122, subType:null, subId:null, process: : 25172, timerCategory:TEST: 0. OpenManage Enterprise Create User, timerSubCategory:3122, zoneId:null, processId:25172], taskConfig:[:],:@45eb737f]
2020-12-09_15:33:43.21913 executing local task [refId:3117, lockTimeout:330000, lockTtl:300000, jobType:jobTemplateExecute, lockId:job.execute.3117, jobTemplateId:3117, jobDate:1607528023018, userId:320, customConfig:null, jobTemplateExecutionId:5667, customInputs:[customOptions:[AnsibleIdentPoolDesc:asdf123, AnsibleIdentPoolCount:50, TrackingUseCase:Customer Demo/Training, AnsiblePoolName:asdf123]], processConfig:[accountId:2, status:executing, username:user@company.com, userId:320, userDisplayName:user, refType:jobTemplate, refId:3117, timerCategory:TEST: 2. Enterprise - Create Identity Pool, timerSubCategory:3117, description:TEST: 2. Enterprise - Create Identity Pool], processMap:[success:true, refType:jobTemplate, refId:3117, subType:null, subId:null, process: : 25147, timerCategory:TEST: 2. Enterprise - Create Identity Pool, timerSubCategory:3117, zoneId:null, processId:25147], taskConfig:[:], :@21ff5f47]
2020-12-09_15:30:53.83030 executing local task [refId:3112, lockTimeout:330000, lockTtl:300000, jobType:jobTemplateExecute, lockId:job.execute.3112, jobTemplateId:3112, jobDate:1607527853230, userId:320, customConfig:null, jobTemplateExecutionId:5662, customInputs:[customOptions:[ReferenceServer:10629, ReferenceServerTemplateName:asdfasdf, TrackingUseCase:Internal Testing/Training, ReferenceServerTemplateDescription:asdfasdf]], processConfig:[accountId:2, status:executing, username:user@company.com, userId:320, userDisplayName:user, refType:jobTemplate, refId:3112, timerCategory:TEST: 1. Enterprise - Create Template From Reference Device, timerSubCategory:3112, description:TEST: 1. Enterprise - Create Template From Reference Device], processMap:[success:true, refType:jobTemplate, refId:3112, subType:null, subId:null, process: : 25142, timerCategory:TEST: 1. Enterprise - Create Template From Reference Device, timerSubCategory:3112, zoneId:null, processId:25142], taskConfig:[:],:@29ac1e41]
The following data needs to be pulled from the messages above.

Message 1:
[customOptions:[AnsibleRequestedUser:testing1, AnsibleRequestedUserPassword:VMware321!]] - I would like to have these in a new field. username:user1 needs to be in a field. timerCategory:TEST: 0. Enterprise Create User needs to be in a field.
The rest of the data can remain in the original message field.
Message 2:
[customOptions:[AnsibleIdentPoolDesc:asdf123, AnsibleIdentPoolCount:50, TrackingUseCase:Customer Demo/Training, AnsiblePoolName:asdf123]] - I need these broken out into separate fields. username:user@company.com needs to be in a field. timerCategory:TEST: 2. Enterprise - Create Identity Pool - I need that in a field.
Message 3:
[customOptions:[ReferenceServer:10629, ReferenceServerTemplateName:asdfasdf, TrackingUseCase:Internal Testing/Training, ReferenceServerTemplateDescription:asdfasdf]] - I need these broken out into separate fields. username:user@company.com needs to be in a field. timerCategory:TEST: 1. Enterprise - Create Template From Reference Device - needs to be in a field.
The customOptions will constantly change - meaning whichever automation is kicked off determines how many customOptions there are, but the format above should stay the same. The username can be an email address or a generic name. Below are some of the logstash filters I have tried; they had some success but do not handle the changing nature of the log messages.
# Testing a new method to get information from the logs.
#if "executing local task" in [message] and "beats" in [tags]{
# dissect {
# mapping => {
# "message" => "%{date} %{?skip1} %{?skip2} %{?skip3} %{?refid} %{?lockTimeout} %{?lockTtl} %{?jobtemplate} %{?jobType} %{?jobTemplateId} %{?jobDate} %{?userId} %{?jobTemplateExecutionId} %{?jobTemplateExecutionId1} customInputs:[customOptions:[%{?RequestedPassword}:%{?RequestedPassword} %{?TrackingUseCase1}:%{TrackingUseCase}, %{?RequestedUser}, %{?processConfig}, %{?status}, username:%{username}, %{?userId}, %{?userDisplayName}, %{?refType}, %{?refID}, %{?timerCategory}:%{TaskName}, %{?timeCat}, %{?description}, %{?extra}"
# }
# }
#}
# Testing Grok Filters instead.
if "executing local task" in [messages] and "beats" in [tags]{
grok {
match => { "message" => "%{YEAR:year}-%{MONTHNUM2:month}-%{MONTHDAY:day}_%{TIME:time}%{SPACE}%{CISCO_REASON}%{SYSLOG5424PRINTASCII}%{SPACE}%{NOTSPACE}%{SPACE}%{NOTSPACE}%{SPACE}%{PROG}%{SPACE}%{PROG}%{SPACE}%{PROG}%{SPACE}%{PROG}%{SPACE}%{PROG}%{SPACE}%{PROG}%{SPACE}%{PROG}%{SPACE}%{SYSLOGPROG}%{SYSLOG5424SD:testing3}%{NOTSPACE}%{SPACE}%{PROG}%{SYSLOG5424SD:testing2}%{NOTSPACE}%{SPACE}%{PROG}%{SYSLOG5424SD:testing}%{GREEDYDATA}}"
}
}
}
I think grok is what I need to use, but I am not familiar with how to split/add the fields to meet the needs above. Any help would be greatly appreciated.

I would suggest not trying to do everything in a single filter, and especially not in a single grok pattern. I would start by using dissect to strip off the timestamp. I keep it in a [@metadata] field so that it can be accessed in the logstash pipeline but will not be processed by the outputs.
dissect { mapping => { "message" => "%{[@metadata][timestamp]} %{} [%{[@metadata][restOfline]}" } }
date { match => [ "[@metadata][timestamp]", "YYYY-MM-dd_HH:mm:ss.SSSSS" ] }
Next I would use grok patterns to break up the rest of the line. If you only need the fields from processConfig then that is the only grok pattern you need; I have included the others as examples of how to pull multiple sections out of one message.
grok {
break_on_match => false
match => {
"[@metadata][restOfline]" => [
"customOptions:\[(?<[@metadata][customOptions]>[^\]]+)",
"processConfig:\[(?<[@metadata][processConfig]>[^\]]+)",
"processMap:\[(?<[@metadata][processMap]>[^\]]+)"
]
}
}
This results in the event having fields such as
"username" => "user@company.com",
"timerCategory" => "TEST: 2. Enterprise - Create Identity Pool"
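The grok patterns above only capture each bracketed section into [@metadata]; to actually produce fields like username and timerCategory, each captured section still has to be split into key/value pairs. A minimal kv sketch (the ", " pair separator and ":" key/value separator are assumptions based on the sample logs; values that themselves contain colons, such as "TEST: 2. Enterprise - Create Identity Pool", should be verified and may need tuning):

    kv {
        # split "username:user1, userId:1897, ..." into individual fields
        source => "[@metadata][processConfig]"
        field_split_pattern => ", "
        value_split => ":"
    }
    kv {
        # keep the variable customOptions keys under their own object
        source => "[@metadata][customOptions]"
        target => "customOptions"
        field_split_pattern => ", "
        value_split => ":"
    }

Because customOptions keys change between automations, giving that kv a target keeps the dynamic keys namespaced instead of scattering top-level fields across the index.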
Here is another grok-based answer (though I agree it would be hard to maintain then and hard to understand now):
filter{
grok {
match => { "message" => "%{DATE:date}_%{TIME:time} %{CISCO_REASON} \[refId\:%{INT:refId}, lockTimeout:%{INT:lockTimeout}, lockTtl:%{INT:lockTtl}, jobType:%{NOTSPACE:jobType}, lockId:%{NOTSPACE:lockId}, jobTemplateId:%{INT:jobTemplateId}, jobDate:%{INT:jobDate}, userId:%{INT:userId}, customConfig:(\{%{GREEDYDATA:customConfig}\}|null), jobTemplateExecutionId:%{INT:jobTemplateExecutionId}, customInputs:\[customOptions:\[%{GREEDYDATA:customOptions}\]\], processConfig:\[%{GREEDYDATA:processConfig}\], processMap:\[%{GREEDYDATA:processMap}\], taskConfig:\[%{GREEDYDATA:taskConfig}\], :%{NOTSPACE:serial}\]"
}
}
kv {
source => "customOptions"
target => "customOptionsSplitter"
field_split_pattern => ", "
value_split => ":"
}
}
Running the dataset you provided through this gave no results; it is almost as if the events pass straight through the filter. if "executing local task" in [messages] and "beats_morpheus" in [tags] - that is the start of the filter section listed above. I broke it up the way you did. The logstash service starts, so the syntax appears correct, and it keeps running; I sent 3 new logs through and no new fields were created. I think I found my mistake: "executing local task" in [messages] should be [message]. Testing again. OK, that fixed it. Would I do the same for customOptions - just set up another kv and add the known fields I want, such as TrackingUseCase, if that is the field I want to break out of that kv section? Yes, just use another kv.
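Following that exchange, the processConfig section captured by the grok pattern (which holds the username and timerCategory values the question asks for) can be split with one more kv filter. A sketch; the processConfigSplitter target name is illustrative, and the separators are assumed from the sample logs:

    kv {
        # break "username:user1, timerCategory:TEST: 0. ..." into fields
        source => "processConfig"
        target => "processConfigSplitter"
        field_split_pattern => ", "
        value_split => ":"
    }

After this, [processConfigSplitter][username] and [processConfigSplitter][timerCategory] should hold the two values to be indexed, while the rest of the data stays in the original message field.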