elasticsearch: How to filter a simple message through Logstash and split it into multiple fields

Tags: elasticsearch, logstash, logstash-grok, logstash-configuration, logstash-file

This is the input file:
{"meta":"","level":"error","message":"clientErrorHandler: Erro não previsto ou mapeado durante chamada dos serviços.","timestamp":"2017-04-06T16:08:37.861Z"}
{"meta":"","level":"error","message":"clientErrorHandler: Erro não previsto ou mapeado durante chamada dos serviços.","timestamp":"2017-04-06T19:40:17.682Z"}
Basically, logs like these are generated by my NodeJS application through the Winston module. My doubt is about how to adjust the Logstash filter so that four fields get created in ElasticSearch.

My intention is to see "columns" (attributes or fields may be better words in the ElasticSearch context): level (e.g. error), message source (e.g. clientErrorHandler), message content (e.g. Erro não…serviços) and the error time (e.g. 2017-04-06T19:40:17).

I got stuck at this point:

1 - I used this logstash.conf:
input {
  file {
    path => "/home/demetrio/dev/testes_manuais/ELK/logs/*"
    start_position => "beginning"
  }
}

filter {
  grok {
    match => {
      "message" => '%{SYSLOG5424SD:loglevel} %{TIMESTAMP_ISO8601:Date} %{GREEDYDATA:content}'
    }
  }

  date {
    match => [ "Date", "YYYY-mm-dd HH:mm:ss.SSS" ]
    locale => en
  }
}

output {
  stdout {
    codec => plain {
      charset => "ISO-8859-1"
    }
  }

  elasticsearch {
    hosts => "http://127.0.0.1:9200"
    index => "dmz-logs-indice"
  }
}
2 - Searched ElasticSearch through the Kibana Dev Tools:
GET _search
{
  "query": {
    "match_all": {}
  }
}
and I saw:
{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": ".kibana",
        "_type": "config",
        "_id": "5.3.0",
        "_score": 1,
        "_source": {
          "buildNum": 14823
        }
      },
      {
        "_index": "dmz-logs-indice",
        "_type": "logs",
        "_id": "AVtJLZ5x6gscWn5fxxA_",
        "_score": 1,
        "_source": {
          "path": "/home/demetrio/dev/testes_manuais/ELK/logs/logs.log",
          "@timestamp": "2017-04-07T16:09:36.996Z",
          "@version": "1",
          "host": "nodejs",
          "message": """{"meta":"","level":"error","message":"clientErrorHandler: Erro não previsto ou mapeado durante chamada dos serviços.","timestamp":"2017-04-06T16:08:37.861Z"}""",
          "tags": [
            "_grokparsefailure"
          ]
        }
      },
      {
        "_index": "dmz-logs-indice",
        "_type": "logs",
        "_id": "AVtJLZ5x6gscWn5fxxBA",
        "_score": 1,
        "_source": {
          "path": "/home/demetrio/dev/testes_manuais/ELK/logs/logs.log",
          "@timestamp": "2017-04-07T16:09:36.998Z",
          "@version": "1",
          "host": "nodejs",
          "message": """{"meta":"","level":"error","message":"clientErrorHandler: Erro não previsto ou mapeado durante chamada dos serviços.","timestamp":"2017-04-06T19:40:17.682Z"}""",
          "tags": [
            "_grokparsefailure"
          ]
        }
      }
    ]
  }
}
I guess I should use some regular expression or Grok to split the message into four pieces:

1 - level
2 - the part of the message before the ":"
3 - the part of the message after the ":"
4 - timestamp

and, if possible, give better labels to the columns (fields/attributes), like:

1 - level
2 - message source
3 - message content
4 - error time

and finally strip the fractional seconds from the timestamp.
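In principle the four pieces could be pulled straight out of the raw line with a single grok pattern keyed on the JSON keys. The following is only a sketch (the labels `level`, `message_source`, `message_content` and `log_time` are my own choices, and grok-ing JSON by hand is brittle compared to parsing the line as JSON):

    filter {
      grok {
        match => {
          "message" => '"level":"%{DATA:level}","message":"%{DATA:message_source}: %{DATA:message_content}","timestamp":"%{TIMESTAMP_ISO8601:log_time}"'
        }
      }
    }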
PS: In case future readers are interested in how I do the logging in NodeJS, here it is:
PS2: I carefully read some similar questions, but I am really stuck.

Since your application outputs the log as a JSON string, you can configure Logstash to parse the log as JSON. This is as simple as adding

    codec => "json"

to the file input configuration. Here is a configuration sample for your scenario:
input {
  file {
    path => "/home/demetrio/dev/testes_manuais/ELK/logs/*"
    start_position => "beginning"
    codec => "json"
  }
}

filter {
  # This maps the `timestamp` field into the `@timestamp` field for Kibana to consume.
  date {
    match => [ "timestamp", "ISO8601" ]
    remove_field => [ "timestamp" ]
  }
}

output {
  stdout {
    # This codec gives you more details about the event.
    codec => rubydebug
  }

  elasticsearch {
    hosts => "http://127.0.0.1:9200"
    index => "dmz-logs-indice"
  }
}
And the stdout output would look like this:

{
    "path" => "/home/demetrio/dev/testes_manuais/ELK/logs/demo.log",
    "@timestamp" => 2017-04-06T19:40:17.682Z,
    "level" => "error",
    "meta" => "",
    "@version" => "1",
    "host" => "dbf718c4b8e4",
    "message" => "clientErrorHandler: Erro não previsto ou mapeado durante chamada dos serviços.",
}
Thanks. I am just missing how to accomplish this part of my question: "My intention is to see 'columns' (I guess attribute or field may be better words in the ElasticSearch context): level (e.g. error), message source (e.g. clientErrorHandler), message content (e.g. Erro não…serviços) and the error time without fractional seconds (e.g. 2017-04-06T19:40:17)". My final intention is to split that message text apart. With your answer, I got the whole message in a single field.
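Building on the json-codec approach, one extra grok stage appended to the filter could split the message on the first ": ". A sketch, where `message_source` and `message_content` are labels I chose (dissect would work similarly):

    filter {
      # `timestamp` comes from the parsed JSON (codec => "json" in the input).
      date {
        match => [ "timestamp", "ISO8601" ]
        remove_field => [ "timestamp" ]
      }
      # "clientErrorHandler: Erro não previsto ..." becomes
      #   message_source  = "clientErrorHandler"
      #   message_content = "Erro não previsto ..."
      grok {
        match => { "message" => "%{DATA:message_source}: %{GREEDYDATA:message_content}" }
      }
      # Keep only the split fields, dropping the original message.
      mutate {
        remove_field => [ "message" ]
      }
    }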