Logstash: importing nested JSON into Elasticsearch

I have a large number (~40,000) of nested JSON objects that I want to insert into an Elasticsearch index.

The JSON objects have the following structure:

    {
        "customerid": "10932",
        "date": "16.08.2006",
        "bez": "xyz",
        "birthdate": "21.05.1990",
        "clientid": "2",
        "address": [
            {
                "addressid": "1",
                "title": "Mr",
                "street": "main str",
                "valid_to": "21.05.1990",
                "valid_from": "21.05.1990"
            },
            {
                "addressid": "2",
                "title": "Mr",
                "street": "melrose place",
                "valid_to": "21.05.1990",
                "valid_from": "21.05.1990"
            }
        ]
    }
So a JSON field (address in this example) can hold an array of JSON objects.

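What would the Logstash configuration look like for importing JSON files/objects like this into Elasticsearch? The Elasticsearch mapping for this index should mirror the structure of the JSON, and the Elasticsearch document id should be set to customerid.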

    input {
      stdin {
        id => "JSON_TEST"
      }
    }
    filter {
      json {
        source => "customerid"
        ....
        ....
      }
    }
    output {
      stdout {}
      elasticsearch {
        hosts => "https://localhost:9200/"
        index => "customers"
        document_id => "%{customerid}"
      }
    }

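If you control how the input is generated, the simplest approach is to format it as single-line JSON (one object per line) and then use the json_lines codec.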

Just change your stdin to:

    stdin { codec => "json_lines" }

and then it will work:

    cat input_file.json | logstash -f json_input.conf

where input_file.json has lines like:

    {"customerid":1,"nested": {"json":"here"}}
    {"customerid":2,"nested": {"json":"there"}}

This way the json filter is not needed.
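
If the objects currently exist only in pretty-printed, multi-line form, they first have to be collapsed to one object per line before the json_lines codec can read them. The following is a minimal sketch of one way to do that in Python, not part of the original answer. It assumes the input is valid JSON (no trailing commas), given either as a top-level array of objects or as several objects written one after another. The function names `iter_json_values` and `to_json_lines` are made up for this illustration.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


def to_json_lines(text):
    """Return one compact single-line JSON string per document in text."""
    lines = []
    for value in iter_json_values(text):
        # A top-level array is treated as a list of documents.
        docs = value if isinstance(value, list) else [value]
        for doc in docs:
            lines.append(
                json.dumps(doc, ensure_ascii=False, separators=(",", ":"))
            )
    return lines


if __name__ == "__main__":
    # Hypothetical sample input: one pretty-printed nested object.
    sample = """
    {
        "customerid": "10932",
        "address": [
            {"addressid": "1", "street": "main str"},
            {"addressid": "2", "street": "melrose place"}
        ]
    }
    """
    for line in to_json_lines(sample):
        print(line)
```

To convert a real file, the same function could be called on the file's contents and each returned line written to input_file.json. Nested arrays such as address stay intact inside each line, so the structure that reaches Elasticsearch is unchanged.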