elasticsearch: Create nested fields from XPath & check for existing documents

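I have two problems:

  • Parsing XML data and adding it to an array in the indexed document

  • Checking the index for an existing document and, if one exists, appending the new data to that document's array

I have a jdbc input that has an xml column: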

    input {
      jdbc {
        ....
        statement => "SELECT event_xml...."
      }
    }
    
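Next comes an xml filter to parse the data. How can I make the last three XPath results end up in an array? Do I need a mutate filter or a ruby filter? I can't seem to figure it out.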

    filter {  
      xml {       
        source => "event_xml"              
        remove_namespaces => true 
        store_xml => false
        force_array => false
        xpath => [ "/CaseNumber/text()", "case_number" ]
        xpath => [ "/FormName/text()", "[conversations][form_name]" ]
        xpath => [ "/EventDate/text()", "[conversations][event_date]" ]
        xpath => [ "/CaseNote/text()", "[conversations][case_note]" ]
      }
    }
    
So in Elasticsearch the document would look like this:

    {
        "case_number" : "12345",
        "conversations" :
            [
                {
                    "form_name" : "form1",
                    "event_date" : "2019-01-09T00:00:00Z",
                    "case_note" : "this is a case note"
                }
            ]                
    }
    
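The second problem: if a document with the unique case_number "12345" already exists, then instead of creating a new document, the new XML values should be appended to its conversations array. The result would look like this: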

    {
        "case_number" : "12345",
        "conversations" : [
            {
                "form_name" : "form1",
                "event_date" : "2019-01-09T00:00:00Z",
                "case_note" : "this is a case note"
            },
            {
                "form_name" : "form2",
                "event_date" : "2019-05-09T00:00:00Z",
                "case_note" : "this is another case note"
            }
        ]                
    }
    
My output section:

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "cases"
        manage_template => false
      }
    }
    

Is this possible? Thanks.

This ruby filter creates the array:

    ruby {
        code => '
            event.set("conversations", [Hash[
              "publish_event_id", event.get("publish_event_id"),
              "form_name", event.get("form_name"),
              "event_date", event.get("event_date"),
              "case_note", event.get("case_note")
            ]])
          '
      }
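
To see the shape this filter produces, here is a minimal sketch in plain Ruby that mirrors the `event.set` call on an ordinary Hash. The helper `wrap_conversation`, the sample values, and the `publish_event_id` of 1001 are made up for illustration; a real Logstash event is not a Hash. It also assumes the four fields are available as flat top-level fields, as the `event.get` calls in the filter imply.

```ruby
# Hypothetical helper: copies the four flat fields into a one-element
# "conversations" array, the same structure the filter's event.set builds.
def wrap_conversation(fields)
  keys = %w[publish_event_id form_name event_date case_note]
  conversation = keys.map { |k| [k, fields[k]] }.to_h
  fields.merge("conversations" => [conversation])
end

# Made-up sample data standing in for one parsed row.
event = {
  "case_number"      => "12345",
  "publish_event_id" => 1001,
  "form_name"        => "form1",
  "event_date"       => "2019-01-09T00:00:00Z",
  "case_note"        => "this is a case note"
}

wrapped = wrap_conversation(event)
puts wrapped["conversations"].inspect
```

Wrapping the hash in an array, even for a single entry, is what lets later events append to the same `conversations` field.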
    
The output then updates the existing document through a scripted upsert:

    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "cases"  
        document_id => "%{case_number}"
        action => "update"
        doc_as_upsert => true
        script => "     
                    boolean recordExists = false;                                                        
                    for (int i = 0; i < ctx._source.conversations.length; i++) 
                    {                  
                        if(ctx._source.conversations[i].publish_event_id == params.event.get('conversations')[0].publish_event_id)
                        {
                            recordExists = true;
                        }                  
                    }     
                    if(!recordExists){
                        ctx._source.conversations.add(params.event.get('conversations')[0]); 
                    }
                  "
        manage_template => false
      }
    }
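
The append-if-missing logic of the script can be tried outside Elasticsearch. The following is only a sketch in plain Ruby, not the Painless script itself: `upsert_conversation` is a hypothetical name, documents are modeled as Hashes, and it assumes that a missing document is simply created from the incoming event, which is the intent of `doc_as_upsert`.

```ruby
# Hypothetical simulation of the update: stored_doc is nil when no document
# with this case_number exists yet.
def upsert_conversation(stored_doc, incoming_doc)
  # No existing document: index the incoming one as-is.
  return incoming_doc if stored_doc.nil?

  incoming = incoming_doc["conversations"][0]
  exists = stored_doc["conversations"].any? do |c|
    c["publish_event_id"] == incoming["publish_event_id"]
  end
  # Append only when this publish_event_id has not been stored before.
  # Note that this modifies stored_doc in place.
  stored_doc["conversations"] << incoming unless exists
  stored_doc
end

# Made-up sample events for the same case.
first = { "case_number" => "12345", "conversations" => [
  { "publish_event_id" => 1001, "form_name" => "form1" }
] }
second = { "case_number" => "12345", "conversations" => [
  { "publish_event_id" => 1002, "form_name" => "form2" }
] }

doc = upsert_conversation(nil, first)    # creates the document
doc = upsert_conversation(doc, second)   # appends form2
doc = upsert_conversation(doc, second)   # replay, nothing is added
puts doc["conversations"].map { |c| c["form_name"] }.inspect
```

Keying the check on `publish_event_id` is what makes re-running the jdbc query safe: a row that was already indexed does not produce a duplicate entry.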
    