Parsing awkward JSON in Logstash
Afternoon,

I've been struggling with this for the past few weeks and can't find a solution. We receive some logs via a third party, and so far I have used grok to extract the value below into a details field. Annoyingly, if it weren't for all the backslashes, this would be really simple.

Is there an easy way to parse this data as JSON in Logstash?
{\"CreationTime\":\"2021-05-11T06:42:44\",\"Id\":\"xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx\",\"Operation\":\"SearchMtpBatch\",\"OrganizationId\":\"xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx\",\"RecordType\":52,\"UserKey\":\"eample@example.onmicrosoft.com\",\"UserType\":5,\"Version\":1,\"Workload\":\"SecurityComplianceCenter\",\"UserId\":\"example@example.onmicrosoft.com\",\"AadAppId\":\"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx\",\"DataType\":\"MtpBatch\",\"DatabaseType\":\"DataInsights\",\"RelativeUrl\":\"/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx\",\"ResultCount\":\"1\"}
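You can achieve this fairly easily as follows.

If your source data actually contains those backslashes, you need to remove them somehow before Logstash can recognise the message as valid JSON. You could do that before the data reaches Logstash, in which case the json codec will probably work as expected. Alternatively, if you want Logstash to handle it, you can use the gsub option of the mutate filter and then parse the result with the json filter: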
filter {
  mutate {
    gsub => ["message", "[\\]", ""]
  }
  json {
    source => "message"
  }
}
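A couple of things to note. This blindly strips every backslash, so if your string values may legitimately contain backslashes you will need something more sophisticated. I have also had trouble escaping backslashes in gsub before, and I find that using the regex "any of" construct, [\\], is safer.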
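To make that caveat concrete, here is a minimal Python sketch of pre-processing outside Logstash. It is an illustration under assumptions, not part of the original answer: the sample payload (including the Path field) and both function names are hypothetical. The first function mirrors the blind strip done by gsub; the second treats the escaped text as the body of a JSON string, so backslashes that belong to the data survive.

```python
import json
import re

# Hypothetical sample: an escaped payload whose "Path" value really contains a backslash.
raw = r'{\"Operation\":\"SearchMtpBatch\",\"RecordType\":52,\"Path\":\"C:\\\\logs\"}'


def strip_all_backslashes(text):
    # Same idea as the mutate/gsub filter above: delete every backslash, then parse.
    return json.loads(re.sub(r"[\\]", "", text))


def decode_as_string_literal(text):
    # Wrap the text in quotes so it becomes a JSON string literal, decode that
    # to undo exactly one layer of escaping, then parse the inner JSON.
    inner = json.loads('"' + text + '"')
    return json.loads(inner)


if __name__ == "__main__":
    print(strip_all_backslashes(raw)["Path"])     # the backslash in the value is lost
    print(decode_as_string_literal(raw)["Path"])  # the backslash in the value is kept
```

The second approach assumes the text contains no unescaped double quotes or raw control characters; if it does, the string decode raises an error rather than silently altering the data.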
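Here is a docker one-liner to run that config. When the config is specified on the command line with -e, stdin input and stdout output are the defaults, so I have left them out here for readability: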
docker run --rm -it docker.elastic.co/logstash/logstash:7.12.1 -e 'filter { mutate { gsub => ["message", "[\\]", "" ]} json { source => "message" } }'
Pasting the example into stdin and hitting return produces this output:
{
"@timestamp" => 2021-05-13T01:57:40.736Z,
"RelativeUrl" => "/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx",
"OrganizationId" => "xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx",
"UserKey" => "eample@example.onmicrosoft.com",
"DataType" => "MtpBatch",
"message" => "{\"CreationTime\":\"2021-05-11T06:42:44\",\"Id\":\"xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx\",\"Operation\":\"SearchMtpBatch\",\"OrganizationId\":\"xxxxxxxxx-xxx-xxxx-xxxx-xxxxxxx\",\"RecordType\":52,\"UserKey\":\"eample@example.onmicrosoft.com\",\"UserType\":5,\"Version\":1,\"Workload\":\"SecurityComplianceCenter\",\"UserId\":\"example@example.onmicrosoft.com\",\"AadAppId\":\"xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx\",\"DataType\":\"MtpBatch\",\"DatabaseType\":\"DataInsights\",\"RelativeUrl\":\"/DataInsights/DataInsightsService.svc/Find/MtpBatch?tenantid=xxxxxxx-xxxxx-xxxx-xxx-xxxxxxxx&PageSize=200&Filter=ModelType+eq+1+and+ContainerUrn+eq+%xxurn%xAZappedUrlInvestigation%xxxxxxxxxxxxxxxxxxxxxx%xx\",\"ResultCount\":\"1\"}",
"UserType" => 5,
"UserId" => "example@example.onmicrosoft.com",
"type" => "stdin",
"host" => "de2c988c09c7",
"@version" => "1",
"Operation" => "SearchMtpBatch",
"AadAppId" => "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxx",
"ResultCount" => "1",
"DatabaseType" => "DataInsights",
"Version" => 1,
"RecordType" => 52,
"CreationTime" => "2021-05-11T06:42:44",
"Id" => "xxxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx",
"Workload" => "SecurityComplianceCenter"
}
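Thanks for the reply. The problem is that I get an error because of the backslashes: the pipeline crashes, saying it is not valid JSON.

Can you update your question with your Logstash config? I think this may be a json codec issue, and you may need to deal with double escaping, so that rather than Logstash seeing \"CreationTime\" it is seeing \\\"CreationTime\\\".

Does the actual source data contain backslashes?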
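One way to answer the double-escaping question is to count how many layers of escaping have to be peeled off before the text parses. The Python sketch below is a hypothetical diagnostic, not something from the thread: unwrap_layers and its max_layers parameter are names made up for illustration.

```python
import json


def unwrap_layers(text, max_layers=3):
    """Return (parsed_value, layers), where layers is how many escaping layers were removed."""
    for layers in range(max_layers + 1):
        try:
            value = json.loads(text)
        except json.JSONDecodeError:
            # Not valid JSON as-is: treat it as the body of a JSON string
            # and peel off one layer of escaping.
            text = json.loads('"' + text + '"')
            continue
        if isinstance(value, str):
            # The text was a quoted JSON string; its content may itself be JSON.
            text = value
            continue
        return value, layers
    raise ValueError("no JSON value found within %d layers" % max_layers)


if __name__ == "__main__":
    print(unwrap_layers(r'{\"CreationTime\":\"2021-05-11T06:42:44\"}'))
```

A result of 0 layers means the data was already plain JSON, 1 matches the sample in the question, and 2 would indicate the double escaping suggested in the comment.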