Kafka JDBC sink connector, insert values in batches


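I receive a lot of messages per second over HTTP (50,000-100,000) and want to save them to PostgreSQL. I decided to use the Kafka JDBC sink for this.

Messages are saved to the database one record at a time rather than in batches. I want to insert records into PostgreSQL in batches of 500-1000 records.

I found some answers on this issue and tried the related options in the configuration, but they seem to have no effect.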

My Kafka JDBC sink PostgreSQL configuration (etc/kafka-connect-jdbc/postgres.properties):

name=test-sink
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=3

# The topics to consume from - required for sink connectors like this one
topics=jsonb_pkgs

connection.url=jdbc:postgresql://localhost:5432/test?currentSchema=test
auto.create=false
auto.evolve=false

insert.mode=insert
connection.user=postgres
table.name.format=${topic}

connection.password=pwd

batch.size=500
# based on 500*3000byte message size
fetch.min.bytes=1500000
fetch.wait.max.ms=1500
max.poll.records=4000

I also added options to connect-distributed.properties:

consumer.fetch.min.bytes=1500000
consumer.fetch.wait.max.ms=1500

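Although each partition receives more than 1000 records per second, the records are still saved to PostgreSQL one at a time.

Edit: the consumer options were added to the other file with the correct names.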

I also added the options to etc/schema-registry/connect-avro-standalone.properties:

# based on 500*3000 byte message size
consumer.fetch.min.bytes=1500000
consumer.fetch.wait.max.ms=1500
consumer.max.poll.records=4000
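
The "based on 500*3000 byte message size" comment in the configuration is a small sizing calculation, and it can be made explicit. The sketch below is only an illustration: the function name sizing and its parameter names are hypothetical, and the "batches per poll" figure assumes the sink flushes its buffer each time batch.size records accumulate, which the configuration above does not by itself confirm.

```python
import math


def sizing(batch_size: int, avg_message_bytes: int, max_poll_records: int) -> dict:
    """Derive the numbers implied by a target batch size (illustrative only)."""
    return {
        # Minimum bytes a fetch should return: one full batch of average-sized messages.
        "fetch_min_bytes": batch_size * avg_message_bytes,
        # How many batch.size chunks a completely full poll would split into
        # (assumption: the sink flushes whenever batch.size records are buffered).
        "batches_per_full_poll": math.ceil(max_poll_records / batch_size),
    }


result = sizing(batch_size=500, avg_message_bytes=3000, max_poll_records=4000)
print(result)  # {'fetch_min_bytes': 1500000, 'batches_per_full_poll': 8}
```

With the values from the question, this reproduces the 1500000 used for consumer.fetch.min.bytes.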

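I realized that I had misunderstood the documentation. The records are inserted into the database one by one. The number of records inserted in one transaction depends on batch.size and consumer.max.poll.records. I expected batch insert to be implemented in a different way. I would like to have the option to insert records like this: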

INSERT INTO table1 (First, Last)
VALUES
    ('Fred', 'Smith'),
    ('John', 'Smith'),
    ('Michael', 'Smith'),
    ('Robert', 'Smith');

But that seems to be impossible.
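
To make the desired statement shape concrete, here is a minimal Python sketch that splits a list of records into fixed-size chunks and builds one parameterized multi-row INSERT per chunk. It is not how the JDBC sink connector works internally. The helper names chunks and multi_row_insert are hypothetical, the %s placeholders assume a psycopg2-style driver, and the table and column names are interpolated without escaping, so this is for illustration only.

```python
from typing import Iterator, List, Sequence, Tuple


def chunks(records: Sequence[tuple], batch_size: int) -> Iterator[Sequence[tuple]]:
    """Yield consecutive slices of at most batch_size records."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def multi_row_insert(table: str, columns: Sequence[str],
                     rows: Sequence[tuple]) -> Tuple[str, List]:
    """Return (sql, params) for a single multi-row INSERT statement."""
    # One "(%s, %s, ...)" group per row; values are passed separately as parameters.
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([row_placeholder] * len(rows))
    )
    params = [value for row in rows for value in row]
    return sql, params


rows = [("Fred", "Smith"), ("John", "Smith"), ("Michael", "Smith"), ("Robert", "Smith")]
statements = [multi_row_insert("table1", ("First", "Last"), batch)
              for batch in chunks(rows, 2)]
for sql, params in statements:
    print(sql, params)
```

Separately, the PostgreSQL JDBC driver documents a reWriteBatchedInserts connection parameter that rewrites batched single-row inserts into multi-row form. Whether it gives the desired behavior with this connector would need to be tested.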

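It should be consumer.max.poll.records, by the way.

I tried changing max.poll.records -> consumer.max.poll.records, but got the same result.

OK. I'm only saying that it is the correct property name. In any case, the records should be sent in separate queries; I'm not sure about batches.

Are there transaction rules in Kafka Connect, @Miguel?

@Miguel Wrong section. Search for "the same parameters can be used but need to be prefixed with producer. and consumer. respectively".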