
elasticsearch: correlating messages in ELK by field


Related:

We are setting up an ELK stack and want to create visualizations in Kibana 4. The problem is that we want to relate two different types of messages.

To simplify:

  • Message type 1 fields: message type, common id number, bytes count, …
  • Message type 2 fields: message type, common id number, hostname
Both message types share the same index in elasticsearch.

As you can see, we have been trying to plot graphs without taking the common id into account, but it seems we have to use it. We just don't know how yet.

Any help?

EDIT

These are the relevant field definitions from our ES template:

      "URIHost" : {
        "type" : "string",
        "norms" : {
          "enabled" : false
        },
        "fields" : {
          "raw" : {
            "type" : "string",
            "index" : "not_analyzed",
            "ignore_above" : 256
          }
        }
      },
      "Type" : {
        "type" : "string",
        "norms" : {
          "enabled" : false
        },
        "fields" : {
          "raw" : {
            "type" : "string",
            "index" : "not_analyzed",
            "ignore_above" : 256
          }
        }
      },
      "SessionID" : {
        "type" : "long"
      },
      "Bytes" : {
        "type" : "long"
      },
      "BytesReceived" : {
        "type" : "long"
      },
      "BytesSent" : {
        "type" : "long"
      },
This is an edited document of the TRAFFIC type:

{
  "_index": "logstash-2015.11.05",
  "_type": "paloalto",
  "_id": "AVDZqdBjpQiRid-uxPjE",
  "_score": null,
  "_source": {
    "@version": "1",
    "@timestamp": "2015-11-05T21:59:55.543Z",
    "syslog_severity_code": 5,
    "syslog_facility_code": 1,
    "syslog_timestamp": "Nov  5 22:59:58",
    "Type": "TRAFFIC",
    "SessionID": 21713,
    "Bytes": 939,
    "BytesSent": 480,
    "BytesReceived": 459,
  },
  "fields": {
    "@timestamp": [
      1446760795543
    ]
  },
  "sort": [
    1446760795543
  ]
}
And this is an edited document of the THREAT type; note that it shares SessionID 21713 with the TRAFFIC document above:

{
  "_index": "logstash-2015.11.05",
  "_type": "paloalto",
  "_id": "AVDZqVNIpQiRid-uxPjC",
  "_score": null,
  "_source": {
    "@version": "1",
    "@timestamp": "2015-11-05T21:59:23.440Z",
    "syslog_severity_code": 5,
    "syslog_facility_code": 1,
    "syslog_timestamp": "Nov  5 22:59:26",
    "Type": "THREAT",
    "SessionID": 21713,
    "URIHost": "whatever.nevermind.com",
    "URIPath": "/connectiontest.html"
  },
  "fields": {
    "@timestamp": [
      1446760763440
    ]
  },
  "sort": [
    1446760763440
  ]
}
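
Both documents can be pulled back together with a simple term query on the shared SessionID (a minimal sketch):

POST /logstash-2015.11.05/_search
{
  "query": {
    "term": { "SessionID": 21713 }
  }
}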
And this is the Logstash "filter" configuration:

filter {
    if [type] == "paloalto" {
        syslog_pri {
            remove_field => [ "syslog_facility", "syslog_severity" ]
        }

        grok {
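            # parse the syslog header and capture the PAN-OS record type (THREAT/TRAFFIC);
            # the CSV remainder of the line lands in the "log" field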
            match => {
                "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{HOSTNAME:hostname} %{INT},%{YEAR}/%{MONTHNUM}/%{MONTHDAY} %{TIME},%{INT},%{WORD:Type},%{GREEDYDATA:log}"
            }
            remove_field => [ "message" ]
        }

        if [Type] == "THREAT" {
            csv {
                source => "log"
                columns => [ "Threat_OR_ContentType", "ConfigVersion", "GenerateTime", "SourceAddress", "DestinationAddress", "NATSourceIP", "NATDestinationIP", "Rule", "SourceUser", "DestinationUser", "Application", "VirtualSystem", "SourceZone", "DestinationZone", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "SessionID", "RepeatCount", "SourcePort", "DestinationPort", "NATSourcePort", "NATDestinationPort", "Flags", "IPProtocol", "Action", "URL", "Threat_OR_ContentName", "reportid", "Category", "Severity", "Direction", "seqno", "actionflags", "SourceCountry", "DestinationCountry", "cpadding", "contenttype", "pcap_id", "filedigest", "cloud", "url_idx", "user_agent", "filetype", "xff", "referer", "sender", "subject", "recipient" ]
                remove_field => [ "log" ]
            }
            mutate {
                convert => {
                    "SessionID" => "integer"
                    "SourcePort" => "integer"
                    "DestinationPort" => "integer"
                    "NATSourcePort" => "integer"
                    "NATDestinationPort" => "integer"
                }
                remove_field => [ "ConfigVersion", "GenerateTime", "VirtualSystem", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "RepeatCount", "Flags", "Action", "reportid", "Severity", "seqno", "actionflags", "cpadding", "pcap_id", "filedigest", "recipient" ]
            }
            grok {
                match => {
                    "URL" => "%{URIHOST:URIHost}%{URIPATH:URIPath}(%{URIPARAM:URIParam})?"
                }
                remove_field => [ "URL" ]
            }
        }

        else if [Type] == "TRAFFIC" {
            csv {
                source => "log"
                columns => [ "Threat_OR_ContentType", "ConfigVersion", "GenerateTime", "SourceAddress", "DestinationAddress", "NATSourceIP", "NATDestinationIP", "Rule", "SourceUser", "DestinationUser", "Application", "VirtualSystem", "SourceZone", "DestinationZone", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "SessionID", "RepeatCount", "SourcePort", "DestinationPort", "NATSourcePort", "NATDestinationPort", "Flags", "IPProtocol", "Action", "Bytes", "BytesSent", "BytesReceived", "Packets", "StartTime", "ElapsedTimeInSecs", "Category", "Padding", "seqno", "actionflags", "SourceCountry", "DestinationCountry", "cpadding", "pkts_sent", "pkts_received", "session_end_reason" ]
                remove_field => [ "log" ]
            }
            mutate {
                convert => {
                    "SessionID" => "integer"
                    "SourcePort" => "integer"
                    "DestinationPort" => "integer"
                    "NATSourcePort" => "integer"
                    "NATDestinationPort" => "integer"
                    "Bytes" => "integer"
                    "BytesSent" => "integer"
                    "BytesReceived" => "integer"
                    "ElapsedTimeInSecs" => "integer"
                }
                remove_field => [ "ConfigVersion", "GenerateTime", "VirtualSystem", "InboundInterface", "OutboundInterface", "LogAction", "TimeLogged", "RepeatCount", "Flags", "Action", "Packets", "StartTime", "seqno", "actionflags", "cpadding", "pcap_id", "filedigest", "recipient" ]
            }
        }

        date {
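            # use the device's syslog timestamp (CET) as the event's @timestamp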
            match => [ "syslog_timastamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
            timezone => "CET"
            remove_field => [ "syslog_timestamp" ]
        }
    }
}
What we are trying to do is visualize URIHost terms as the X-axis and the Bytes, BytesSent and BytesReceived sums as the Y-axis.
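
In Elasticsearch terms, the chart we are after would correspond roughly to an aggregation like the following sketch, which of course only works if URIHost and the byte counters end up on the same document, and that is exactly our problem (the terms bucket targets the not_analyzed URIHost.raw sub-field from the template above; the bucket size is illustrative):

POST /logstash-*/_search
{
  "size": 0,
  "aggs": {
    "per_host": {
      "terms": { "field": "URIHost.raw", "size": 20 },
      "aggs": {
        "total_bytes":    { "sum": { "field": "Bytes" } },
        "bytes_sent":     { "sum": { "field": "BytesSent" } },
        "bytes_received": { "sum": { "field": "BytesReceived" } }
      }
    }
  }
}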

I think you can use the aggregate filter to carry out your task. That filter supports aggregating several log lines into one single event based on a common field value. In your case, the common field we are going to use is the SessionID field.

Then we need another field to detect the first event versus the second/last event that should be aggregated. In your case, that is the Type field.

You'd need to change your current configuration as follows:

filter {

    ... all other filters

    if [Type] == "THREAT" {
        ... all other filters

        aggregate {
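            # buffer URIHost/URIPath from the THREAT event, keyed by SessionID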
            task_id => "%{SessionID}"
            code => "map['URIHost'] = event['URIHost']; map['URIPath'] = event['URIPath']"
        }
    }

    else if [Type] == "TRAFFIC" {
        ... all other filters

        aggregate {
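            # copy the buffered values onto the TRAFFIC event and close the task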
            task_id => "%{SessionID}"
            code => "event['URIHost'] = map['URIHost']; event['URIPath'] = map['URIPath']"
            end_of_task => true
            timeout => 120
        }
    }
}
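One caveat: the aggregate filter keeps its map in local memory, so Logstash must run with a single filter worker (-w 1) for the correlation to work reliably; with several workers, the THREAT and TRAFFIC events of the same session may be processed by different workers and never meet.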
The general idea is that when Logstash encounters a THREAT log, it will temporarily store the URIHost and URIPath in an in-memory event map, and when the TRAFFIC log for the same session comes in, the URIHost and URIPath fields will be added to that event. Other fields can be copied over as well, if needed. You can also adapt the timeout (in seconds) depending on how long after the last THREAT event you expect the TRAFFIC event to come in.

In the end, you will get documents with data merged from both the THREAT and TRAFFIC log lines, and you can easily create a visualization showing the bytes count per URIHost, as shown in the screenshot.
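
For illustration, the merged TRAFFIC document built from the two sample documents above would end up looking roughly like this (a sketch showing only the relevant fields):

{
  "_index": "logstash-2015.11.05",
  "_type": "paloalto",
  "_source": {
    "@timestamp": "2015-11-05T21:59:55.543Z",
    "Type": "TRAFFIC",
    "SessionID": 21713,
    "Bytes": 939,
    "BytesSent": 480,
    "BytesReceived": 459,
    "URIHost": "whatever.nevermind.com",
    "URIPath": "/connectiontest.html"
  }
}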

What are the values of the message type field, and does a "type 1" message always come before the "type 2" one? Can you share your existing Logstash configuration, so that people don't have to guess at your setup? "As you can see" reads like "please stare at my screenshot and try to reverse-engineer what I'm trying to do". Can you describe the data that exists in elasticsearch better (maybe a table with real examples), show how you would like it displayed and combined, and then describe the actual problem of visualizing that data?

@Val: one or more "THREAT" type messages should come before a single "TRAFFIC" type message. Thanks!

So to sum up, you have 1+ THREAT logs and one closing TRAFFIC log, all sharing the same SessionID, right? And if you have two or more THREAT logs, should they be aggregated together as well?

I left URIPath out of the analysis to make URIHost more prominent. Thanks a lot.