Grok pattern for Logstash to create separate sections in a Kibana dashboard
For a long time I have been trying to extract and tag data from a custom log using Logstash, without making any progress. I have a custom HAProxy log that looks like this:
Feb 22 21:17:32 ap haproxy[1235]: 10.172.80.45:32071 10.31.33.34:44541 10.31.33.34:32772 13.127.229.72:443 [22/Feb/2020:21:17:32.006] this_machine~ backend_test-tui/test-tui_32772 40/0/5/1/836 200 701381 - - ---- 0/0/0/0/0 0/0 {testtui.net} {cache_hit} "GET /ob/720/output00007.ts HTTP/1.1"
I want to extract and tag specific parts of the log for the Kibana dashboard, such as:
- from the "40/0/5/1/836" section, tag only the last number (836) as "response_time"
- "701381" as "response_bytes"
- "/ob/720/output00007.ts" as "content_url"
- and use the timestamp from the log line itself rather than the default one

At the moment the line fails to parse and I get the following (note the "_grokparsefailure" tag):
{
"@version" => "1",
"message" => "Mar 8 13:53:59 ap haproxy[22158]: 10.172.80.45:30835 10.31.33.34:57886 10.31.33.34:32771 43.252.91.147:443 [08/Mar/2020:13:53:59.827] this_machine~ backend_noida/noida_32771 55/0/1/0/145 200 2146931 - - ---- 0/0/0/0/0 0/0 {testalef1.adcontentamtsolutions.} {cache_hit} \"GET /felaapp/virtual_videos/og/1080/output00006.ts HTTP/1.1\"",
"@timestamp" => 2020-03-08T10:24:07.348Z,
"path" => "/home/alef/haproxy.log",
"host" => "com1",
"tags" => [
[0] "_grokparsefailure"
]
}
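For reference, the three desired extractions can be sketched outside Logstash with a plain regular expression. This is only a hypothetical Python equivalent of the target grok captures, run against the sample log line from the question; it is not part of the pipeline:

```python
import re

# Sample HAProxy log line from the question.
line = ('Feb 22 21:17:32 ap haproxy[1235]: 10.172.80.45:32071 '
        '10.31.33.34:44541 10.31.33.34:32772 13.127.229.72:443 '
        '[22/Feb/2020:21:17:32.006] this_machine~ backend_test-tui/test-tui_32772 '
        '40/0/5/1/836 200 701381 - - ---- 0/0/0/0/0 0/0 {testtui.net} '
        '{cache_hit} "GET /ob/720/output00007.ts HTTP/1.1"')

# The timers appear as five slash-separated values; only the last one
# (total time, 836) is wanted, followed by the status and byte count,
# and the request path inside the quoted request.
m = re.search(r'\d+/\d+/\d+/\d+/(?P<response_time>\d+) '
              r'(?P<status>\d+) (?P<response_bytes>\d+)'
              r'.*"\w+ (?P<content_url>\S+) HTTP', line)

print(m.group('response_time'))   # 836
print(m.group('response_bytes'))  # 701381
print(m.group('content_url'))     # /ob/720/output00007.ts
```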
Below is the filter I created:
%{MONTH:[Month]} %{MONTHDAY:[date]} %{TIME:[time]} %{WORD:[source]} %{WORD:[app]}\[%{DATA:[class]}\]: %{IPORHOST:[UE_IP]}:%{NUMBER:[UE_Port]} %{IPORHOST:[NATTED_IP]}:%{NUMBER:[NATTED_Source_Port]} %{IPORHOST:[NATTED_IP]}:%{NUMBER:[NATTED_Destination_Port]} %{IPORHOST:[WAN_IP]}:%{NUMBER:[WAN_Port]} \[%{HAPROXYDATE:[accept_date]}\] %{NOTSPACE:[frontend_name]}~ %{NOTSPACE:[backend_name]} %{NOTSPACE:[ty_name]}/%{NUMBER:[response_time]} %{NUMBER:[http_status_code]} %{INT:[response_bytes]} - - ---- %{NOTSPACE:[df]} %{NOTSPACE:[df]} %{DATA:[domain_name]} %{DATA:[cache_status]} %{DATA:[domain_name]} %{NOTSPACE:[content]} HTTP/%{NUMBER:[http_version]}
And below is my Logstash config file:
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{MONTH:[Month]} %{MONTHDAY:[date]} %{TIME:[time]} %{WORD:[source]} %{WORD:[app]}\[%{DATA:[class]}\]: %{IPORHOST:[UE_IP]}:%{NUMBER:[UE_Port]} %{IPORHOST:[NATTED_IP]}:%{NUMBER:[NATTED_Source_Port]} %{IPORHOST:[NATTED_IP]}:%{NUMBER:[NATTED_Destination_Port]} %{IPORHOST:[WAN_IP]}:%{NUMBER:[WAN_Port]} \[%{HAPROXYDATE:[accept_date]}\] %{NOTSPACE:[frontend_name]}~ %{NOTSPACE:[backend_name]} %{NOTSPACE:[ty_name]}/%{NUMBER:[response_time]} %{NUMBER:[http_status_code]} %{INT:[response_bytes]} - - ---- %{NOTSPACE:[df]} %{NOTSPACE:[df]} %{DATA:[domain_name]} %{DATA:[cache_status]} %{DATA:[domain_name]} %{NOTSPACE:[content]} HTTP/%{NUMBER:[http_version]} " }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
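As an aside on the date filter above: it matches a field named "timestamp", while the grok pattern captures the bracketed time as accept_date, and the sample value "22/Feb/2020:21:17:32.006" carries milliseconds that the pattern "dd/MMM/yyyy:HH:mm:ss Z" does not cover. A minimal Python sketch of the equivalent parsing, assuming the layout from the sample line:

```python
from datetime import datetime

# Bracketed time from the sample line, as captured by %{HAPROXYDATE:[accept_date]}.
accept_date = '22/Feb/2020:21:17:32.006'

# strptime equivalent of the Joda-style pattern "dd/MMM/yyyy:HH:mm:ss.SSS";
# %f consumes the fractional-seconds part.
ts = datetime.strptime(accept_date, '%d/%b/%Y:%H:%M:%S.%f')
print(ts.isoformat())  # 2020-02-22T21:17:32.006000
```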
Using the filter below solved my problem; I had to debug it in Logstash itself to arrive at the correct filter:
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => { "message" => "%{MONTH:[Month]} %{MONTHDAY:[date]} %{TIME:[time]} %{WORD:[source]} %{WORD:[app]}\[%{DATA:[class]}\]: %{IPORHOST:[UE_IP]}:%{NUMBER:[UE_Port]} %{IPORHOST:[NATTED_IP]}:%{NUMBER:[NATTED_Source_Port]} %{IPORHOST:[NATTED_IP]}:%{NUMBER:[NATTED_Destination_Port]} %{IPORHOST:[WAN_IP]}:%{NUMBER:[WAN_Port]} \[%{HAPROXYDATE:[accept_date]}\] %{NOTSPACE:[frontend_name]}~ %{NOTSPACE:[backend_name]} %{NOTSPACE:[ty_name]}/%{NUMBER:[response_time]:int} %{NUMBER:[http_status_code]} %{NUMBER:[response_bytes]:int} - - ---- %{NOTSPACE:[df]} %{NOTSPACE:[df]} %{DATA:[domain_name]} %{DATA:[cache_status]} %{DATA:[domain_name]} %{URIPATHPARAM:[content]} HTTP/%{NUMBER:[http_version]}" }
    add_tag => [ "response_time", "response_time" ]
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
  stdout {
    codec => rubydebug
  }
}