Correlate messages in ELK by field

Tags: elasticsearch, logstash, kibana, kibana-4, elastic-stack

We are setting up ELK and would like to create a visualization in Kibana 4. The issue is that we want to relate two different types of messages. To simplify:
- Message type 1 fields: message type, common ID number, byte count, ...
- Message type 2 fields: message type, common ID number, hostname
Both messages share the same index in Elasticsearch.
As you can see, we were trying to graph without taking the common ID into account, but it seems that we must use it. We don't know how yet, though.
Any help?
EDIT
These are the relevant field definitions in the ES template:
"URIHost" : {
  "type" : "string",
  "norms" : {
    "enabled" : false
  },
  "fields" : {
    "raw" : {
      "type" : "string",
      "index" : "not_analyzed",
      "ignore_above" : 256
    }
  }
},
"Type" : {
  "type" : "string",
  "norms" : {
    "enabled" : false
  },
  "fields" : {
    "raw" : {
      "type" : "string",
      "index" : "not_analyzed",
      "ignore_above" : 256
    }
  }
},
"SessionID" : {
  "type" : "long"
},
"Bytes" : {
  "type" : "long"
},
"BytesReceived" : {
  "type" : "long"
},
"BytesSent" : {
  "type" : "long"
},
And this is a TRAFFIC type document (edited):
{
  "_index": "logstash-2015.11.05",
  "_type": "paloalto",
  "_id": "AVDZqdBjpQiRid-uxPjE",
  "_score": null,
  "_source": {
    "@version": "1",
    "@timestamp": "2015-11-05T21:59:55.543Z",
    "syslog_severity_code": 5,
    "syslog_facility_code": 1,
    "syslog_timestamp": "Nov 5 22:59:58",
    "Type": "TRAFFIC",
    "SessionID": 21713,
    "Bytes": 939,
    "BytesSent": 480,
    "BytesReceived": 459
  },
  "fields": {
    "@timestamp": [
      1446760795543
    ]
  },
  "sort": [
    1446760795543
  ]
}
And this is a THREAT type document:
{
  "_index": "logstash-2015.11.05",
  "_type": "paloalto",
  "_id": "AVDZqVNIpQiRid-uxPjC",
  "_score": null,
  "_source": {
    "@version": "1",
    "@timestamp": "2015-11-05T21:59:23.440Z",
    "syslog_severity_code": 5,
    "syslog_facility_code": 1,
    "syslog_timestamp": "Nov 5 22:59:26",
    "Type": "THREAT",
    "SessionID": 21713,
    "URIHost": "whatever.nevermind.com",
    "URIPath": "/connectiontest.html"
  },
  "fields": {
    "@timestamp": [
      1446760763440
    ]
  },
  "sort": [
    1446760763440
  ]
}
And this is the Logstash "filter" configuration:
filter {
  if [type] == "paloalto" {
    syslog_pri {
      remove_field => [ "syslog_facility", "syslog_severity" ]
    }
    grok {
      match => {
        "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{HOSTNAME:hostname} %{INT},%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME},%{INT},%{WORD:Type},%{GREEDYDATA:log}"
      }
      remove_field => [ "message" ]
    }
    if [Type] == "THREAT" {
      csv {
        source => "log"
        columns => [ "Threat_OR_ContentType", "ConfigVersion", "GenerateTime", "SourceAddress", "DestinationAddress", "NATSourceIP", "NATDestinationIP", "Rule", "SourceUser", "DestinationUser", "Application", "VirtualSystem", "SourceZone", "DestinationZone", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "SessionID", "RepeatCount", "SourcePort", "DestinationPort", "NATSourcePort", "NATDestinationPort", "Flags", "IPProtocol", "Action", "URL", "Threat_OR_ContentName", "reportid", "Category", "Severity", "Direction", "seqno", "actionflags", "SourceCountry", "DestinationCountry", "cpadding", "contenttype", "pcap_id", "filedigest", "cloud", "url_idx", "user_agent", "filetype", "xff", "referer", "sender", "subject", "recipient" ]
        remove_field => [ "log" ]
      }
      mutate {
        convert => {
          "SessionID" => "integer"
          "SourcePort" => "integer"
          "DestinationPort" => "integer"
          "NATSourcePort" => "integer"
          "NATDestinationPort" => "integer"
        }
        remove_field => [ "ConfigVersion", "GenerateTime", "VirtualSystem", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "RepeatCount", "Flags", "Action", "reportid", "Severity", "seqno", "actionflags", "cpadding", "pcap_id", "filedigest", "recipient" ]
      }
      grok {
        match => {
          "URL" => "%{URIHOST:URIHost}%{URIPATH:URIPath}(%{URIPARAM:URIParam})?"
        }
        remove_field => [ "URL" ]
      }
    }
    else if [Type] == "TRAFFIC" {
      csv {
        source => "log"
        columns => [ "Threat_OR_ContentType", "ConfigVersion", "GenerateTime", "SourceAddress", "DestinationAddress", "NATSourceIP", "NATDestinationIP", "Rule", "SourceUser", "DestinationUser", "Application", "VirtualSystem", "SourceZone", "DestinationZone", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "SessionID", "RepeatCount", "SourcePort", "DestinationPort", "NATSourcePort", "NATDestinationPort", "Flags", "IPProtocol", "Action", "Bytes", "BytesSent", "BytesReceived", "Packets", "StartTime", "ElapsedTimeInSecs", "Category", "Padding", "seqno", "actionflags", "SourceCountry", "DestinationCountry", "cpadding", "pkts_sent", "pkts_received", "session_end_reason" ]
        remove_field => [ "log" ]
      }
      mutate {
        convert => {
          "SessionID" => "integer"
          "SourcePort" => "integer"
          "DestinationPort" => "integer"
          "NATSourcePort" => "integer"
          "NATDestinationPort" => "integer"
          "Bytes" => "integer"
          "BytesSent" => "integer"
          "BytesReceived" => "integer"
          "ElapsedTimeInSecs" => "integer"
        }
        remove_field => [ "ConfigVersion", "GenerateTime", "VirtualSystem", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "RepeatCount", "Flags", "Action", "Packets", "StartTime", "seqno", "actionflags", "cpadding", "pcap_id", "filedigest", "recipient" ]
      }
    }
    date {
      # The source field must match the name captured by the first grok above
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
      timezone => "CET"
      remove_field => [ "syslog_timestamp" ]
    }
  }
}
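What we are trying to do is visualize URIHost terms on the X axis and the sums of Bytes, BytesSent and BytesReceived on the Y axis.

I think you can use the aggregate filter to carry out your task. The aggregate filter supports aggregating several log lines into a single event based on a common field value. In your case, the common field is SessionID.
Next, we need another field to tell the first event apart from the second/last event that should be aggregated. In your case, this is the Type field.
You need to change your current configuration as follows: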
filter {
  ... all other filters
  if [Type] == "THREAT" {
    ... all other filters
    aggregate {
      task_id => "%{SessionID}"
      code => "map['URIHost'] = event['URIHost']; map['URIPath'] = event['URIPath']"
    }
  }
  else if [Type] == "TRAFFIC" {
    ... all other filters
    aggregate {
      task_id => "%{SessionID}"
      code => "event['URIHost'] = map['URIHost']; event['URIPath'] = map['URIPath']"
      end_of_task => true
      timeout => 120
    }
  }
}
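The general idea is that when Logstash encounters a THREAT log, it temporarily stores URIHost and URIPath in the in-memory event map; when the TRAFFIC log then comes in, the URIHost and URIPath fields are added to that event. You can copy other fields as well if needed. You can also adjust the timeout (in seconds) depending on how long after the last THREAT event you expect the TRAFFIC event to arrive.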
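The mechanism described above can be sketched in plain Python. This is only a simplified model of what the aggregate filter does, not the plugin itself: the function name `merge_sessions` is made up for illustration, the timeout is not modelled, and events are assumed to arrive in order (the plugin documentation recommends a single filter worker, `-w 1`, for that reason).

```python
def merge_sessions(events):
    """Yield events in order; TRAFFIC events gain URIHost/URIPath
    remembered from earlier THREAT events with the same SessionID."""
    maps = {}  # one in-memory map per task_id (here: the SessionID)
    for event in events:
        event = dict(event)  # work on a copy, leave the input untouched
        session = event["SessionID"]
        task_map = maps.setdefault(session, {})
        if event["Type"] == "THREAT":
            # Store the fields; a later THREAT for the same session
            # overwrites what an earlier one stored.
            task_map["URIHost"] = event.get("URIHost")
            task_map["URIPath"] = event.get("URIPath")
        elif event["Type"] == "TRAFFIC":
            # Copy the remembered fields onto the TRAFFIC event ...
            for field in ("URIHost", "URIPath"):
                if field in task_map:
                    event[field] = task_map[field]
            # ... and drop the map, mirroring end_of_task => true.
            del maps[session]
        yield event


# Sample events reduced from the two documents in the question.
events = [
    {"Type": "THREAT", "SessionID": 21713,
     "URIHost": "whatever.nevermind.com", "URIPath": "/connectiontest.html"},
    {"Type": "TRAFFIC", "SessionID": 21713,
     "Bytes": 939, "BytesSent": 480, "BytesReceived": 459},
]
merged = list(merge_sessions(events))
print(merged[-1])
```

Note that the `event['URIHost']` syntax in the configuration is the Logstash 2.x event API; later Logstash releases use `event.get(...)` and `event.set(...)` instead.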
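In the end, you will get documents containing data merged from both the THREAT and TRAFFIC log lines, and you can easily create the visualization showing the byte count per URIHost, as shown in your screenshot.

What are the values of the message type field, and does the "type 1" message always come before the "type 2" one? Can you share your existing Logstash configuration so people don't have to guess your setup?
"As you can see" means "please stare at my screenshot and try to reverse-engineer what I want to do". Could you better describe the data present in Elasticsearch (perhaps a table with real examples) and how you would like to display it (show us how you would combine it, even just in a table), and then better describe the problem with visualizing it?
@Val: One or more "THREAT" type messages should appear before a single "TRAFFIC" type message.
Thanks. So to sum up, you have 1+ THREAT logs and 1 ending TRAFFIC log that share the same SessionID, correct? If you have two or more THREAT logs, do you want to aggregate them together as well?
I've left out URIPath in my analysis to make URIHost more prominent. Thanks a lot.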
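For reference, the chart the question asks for amounts to grouping the merged TRAFFIC documents by URIHost and summing the three byte fields. In Kibana this would presumably be a terms bucket on the not_analyzed `URIHost.raw` sub-field from the template, with Sum metrics on the Y axis. The Python sketch below shows the same computation on plain dictionaries; the function name `bytes_per_host` and the sample values other than those from the question are illustrative assumptions.

```python
from collections import defaultdict


def bytes_per_host(docs, fields=("Bytes", "BytesSent", "BytesReceived")):
    """Sum the given byte fields per URIHost, like a terms bucket
    with sum metrics. Documents without a URIHost are skipped."""
    totals = defaultdict(lambda: dict.fromkeys(fields, 0))
    for doc in docs:
        host = doc.get("URIHost")
        if host is None:
            continue  # e.g. a TRAFFIC event with no matching THREAT
        for field in fields:
            totals[host][field] += doc.get(field, 0)
    return dict(totals)


# Hypothetical merged documents; the first uses the values from the question.
docs = [
    {"URIHost": "whatever.nevermind.com",
     "Bytes": 939, "BytesSent": 480, "BytesReceived": 459},
    {"URIHost": "whatever.nevermind.com",
     "Bytes": 61, "BytesSent": 20, "BytesReceived": 41},
    {"Bytes": 5, "BytesSent": 2, "BytesReceived": 3},  # no URIHost: ignored
]
print(bytes_per_host(docs))
```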