
Multiple inputs on logstash jdbc


I am using Logstash JDBC to keep MySQL and Elasticsearch in sync. It works fine for a single table, but now I want to do this for multiple tables. Do I need to open multiple instances of

logstash  agent -f /Users/logstash/logstash-jdbc.conf 
in the terminal, one per table with its own select query, or is there a better way to update multiple tables?

My config file:

input {
  jdbc {
    jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
    jdbc_user => "root"
    jdbc_password => "password"
    schedule => "* * * * *"
    statement => "select * from table1"
  }
}
output {
    elasticsearch {
        index => "testdb"
        document_type => "table1"
        document_id => "%{table_id}"
        hosts => "localhost:9200"
    }
}

You can definitely use a single config with multiple jdbc input sections and then parameterize the index and document_type in your elasticsearch output depending on which table the event came from:

input {
  jdbc {
    jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
    jdbc_user => "root"
    jdbc_password => "password"
    schedule => "* * * * *"
    statement => "select * from table1"
    type => "table1"
  }
  jdbc {
    jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
    jdbc_user => "root"
    jdbc_password => "password"
    schedule => "* * * * *"
    statement => "select * from table2"
    type => "table2"
  }
  # add more jdbc inputs to suit your needs 
}
output {
    elasticsearch {
        index => "testdb"
        document_type => "%{type}"   # <- use the type from each input
        hosts => "localhost:9200"
    }
}
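
With all of the inputs in a single config like this, you only need one Logstash process instead of one per table, started with the same command as in the question:

logstash agent -f /Users/logstash/logstash-jdbc.conf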

This does not create duplicate data, and it is compatible with Logstash 6.x.

# YOUR_DATABASE_NAME : test
# FIRST_TABLE :  place  
# SECOND_TABLE :  things    
# SET_DATA_INDEX : test_index_1, test_index_2

input {
    jdbc {
        # The path to our downloaded jdbc driver
        jdbc_driver_library => "/mysql-connector-java-5.1.44-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        # MySQL jdbc connection string to our database, YOUR_DATABASE_NAME
        jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
        # The user we wish to execute our statement as
        jdbc_user => "root"
        jdbc_password => ""
        schedule => "* * * * *"
        statement => "SELECT  @slno:=@slno+1 aut_es_1, es_qry_tbl.* FROM (SELECT * FROM `place`) es_qry_tbl, (SELECT @slno:=0) es_tbl"
        type => "place"
        add_field => { "queryFunctionName" => "getAllDataFromFirstTable" }
        use_column_value => true
        tracking_column => "aut_es_1"
    }

    jdbc {
        # The path to our downloaded jdbc driver
        jdbc_driver_library => "/mysql-connector-java-5.1.44-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        # MySQL jdbc connection string to our database, YOUR_DATABASE_NAME
        jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
        # The user we wish to execute our statement as
        jdbc_user => "root"
        jdbc_password => ""
        schedule => "* * * * *"
        statement => "SELECT  @slno:=@slno+1 aut_es_2, es_qry_tbl.* FROM (SELECT * FROM `things`) es_qry_tbl, (SELECT @slno:=0) es_tbl"
        type => "things"
        add_field => { "queryFunctionName" => "getAllDataFromSecondTable" }
        use_column_value => true
        tracking_column => "aut_es_2"
    } 
}

# install uuid plugin 'bin/logstash-plugin install logstash-filter-uuid'
# The uuid filter allows you to generate a UUID and add it as a field to each processed event.

filter {

    # build the document id from the per-table tracking columns
    # (each event only carries one of these two fields)
    mutate {
            add_field => {
                    "[@metadata][document_id]" => "%{aut_es_1}%{aut_es_2}"
            }
    }

    uuid {
        target    => "uuid"
        overwrite => true
    }    
}

output {
    stdout {codec => rubydebug}
    if [type] == "place" {
        elasticsearch {
            hosts => "localhost:9200"
            index => "test_index_1_12"
            #document_id => "%{aut_es_1}"
            document_id => "%{[@metadata][document_id]}"
        }
    }
    if [type] == "things" {
        elasticsearch {
            hosts => "localhost:9200"
            index => "test_index_2_13"
            document_id => "%{[@metadata][document_id]}"
            # document_id => "%{aut_es_2}"
            # you can set document_id; otherwise ES will generate a unique id.
        }
    }
}
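
As a side note (not part of the answer above): if you also want each scheduled run to fetch only new rows instead of re-reading the whole table, the jdbc input can persist the tracking column and expose it through the built-in :sql_last_value parameter. A minimal sketch, assuming the place table has an auto-increment id column:

input {
    jdbc {
        jdbc_driver_library => "/mysql-connector-java-5.1.44-bin.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
        jdbc_user => "root"
        jdbc_password => ""
        schedule => "* * * * *"
        # :sql_last_value holds the last saved value of the tracking column,
        # so each run only selects rows added since the previous run
        statement => "SELECT * FROM place WHERE id > :sql_last_value"
        use_column_value => true
        tracking_column => "id"
        type => "place"
    }
}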

If you need to run more than one pipeline in the same process, Logstash provides a way to do this through a configuration file called pipelines.yml, using multiple pipelines.

Using multiple pipelines is especially useful if your current configuration has event flows that do not share the same inputs, filters, and outputs, and that are being separated from each other using tags and conditionals.
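
A minimal sketch of what such a pipelines.yml could look like, assuming the two tables are split into separate config files (the pipeline ids and paths below are illustrative, not from the question):

# config/pipelines.yml
- pipeline.id: table1-sync
  path.config: "/Users/logstash/table1-jdbc.conf"
- pipeline.id: table2-sync
  path.config: "/Users/logstash/table2-jdbc.conf"

Each pipeline then runs its own inputs, filters, and outputs, so no type tags or conditionals are needed to keep the event flows apart.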


Comments:

"You can definitely use one config with multiple jdbc inputs and then parameterize the index and document type in the elasticsearch output based on which table the event came from." Any examples or samples?

Hmm... I think the problem is with document_id => "%{table_id}", unless I can auto-generate a unique document id. Can you elaborate?

Does each table have a different ID field that is used as the document ID?

Yes, so I have different IDs, and I would rather have Elasticsearch create the IDs than try to use the existing MySQL IDs.

Then just remove the document_id setting and ES will auto-generate its own IDs. I have updated my answer.

Thank you... Can you tell me how to get the output on screen? I am running logstash -f logstash-jdbc.conf but it is not showing any output, so I cannot tell whether it has finished or is still running.

I am trying to get a similar result from multiple jdbc blocks. Can someone help me access the table values returned by the first jdbc block? jdbc { ... statement => "select * from users" ... } jdbc { ... statement => "select * from customer where user_id = '%{users.id}'" // how to achieve this? ... }
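
For reference, a minimal sketch of the change suggested in the comments above: with the document_id setting removed, Elasticsearch auto-generates a unique _id for each document (index, type, and host values taken from the question):

output {
    elasticsearch {
        index => "testdb"
        document_type => "%{type}"
        # no document_id set, so Elasticsearch generates its own unique _id
        hosts => "localhost:9200"
    }
}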