PostgreSQL: Postgres Debezium CDC does not publish changes to Kafka
Tags: postgresql, apache-kafka, apache-kafka-connect, debezium
My current test configuration looks like this:
version: '3.7'
services:
  postgres:
    image: debezium/postgres
    restart: always
    ports:
      - "5432:5432"
  zookeeper:
    image: debezium/zookeeper
    ports:
      - "2181:2181"
      - "2888:2888"
      - "3888:3888"
  kafka:
    image: debezium/kafka
    restart: always
    ports:
      - "9092:9092"
    links:
      - zookeeper
    depends_on:
      - zookeeper
    environment:
      - ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_GROUP_MIN_SESSION_TIMEOUT_MS=250
  connect:
    image: debezium/connect
    restart: always
    ports:
      - "8083:8083"
    links:
      - zookeeper
      - postgres
      - kafka
    depends_on:
      - zookeeper
      - postgres
      - kafka
    environment:
      - BOOTSTRAP_SERVERS=kafka:9092
      - GROUP_ID=1
      - CONFIG_STORAGE_TOPIC=my_connect_configs
      - OFFSET_STORAGE_TOPIC=my_connect_offsets
      - STATUS_STORAGE_TOPIC=my_source_connect_statuses
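I run it with docker-compose like this:
$ docker-compose up
I see no error messages, and everything appears to be running fine. If I run docker ps, I see that all services are running.
To check that Kafka is working, I wrote a Kafka producer and a Kafka consumer in Python: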
# producer. I run it in one console window
from kafka import KafkaProducer
from json import dumps
from time import sleep
producer = KafkaProducer(bootstrap_servers=['localhost:9092'], value_serializer=lambda x: dumps(x).encode('utf-8'))
for e in range(1000):
    data = {'number' : e}
    producer.send('numtest', value=data)
    sleep(5)
# consumer. I run it in another console window
from kafka import KafkaConsumer
from json import loads
consumer = KafkaConsumer(
    'numtest',
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='earliest',
    enable_auto_commit=True,
    group_id='my-group',
    value_deserializer=lambda x: loads(x.decode('utf-8')))
for message in consumer:
    print(message)
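And it works perfectly. I can see my producer publishing messages, and I can see them being consumed in the consumer window.
Now I want to get CDC working. First, inside the Postgres container, I set the password of the postgres role to postgres: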
$ su postgres
$ psql
psql> \password postgres
Enter new password: postgres
Then I created a new database called test:
psql> CREATE DATABASE test;
I created a table:
psql> \c test;
test=# create table mytable (id serial, name varchar(128), primary key(id));
Finally, I created a connector for my Debezium CDC stack:
$ curl -X POST -H "Accept:application/json" -H "Content-Type:application/json" localhost:8083/connectors/ -d '{
"name": "test-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"plugin.name": "pgoutput",
"database.hostname": "postgres",
"database.port": "5432",
"database.user": "postgres",
"database.password": "postgres",
"database.dbname" : "test",
"database.server.name": "postgres",
"database.whitelist": "public.mytable",
"database.history.kafka.bootstrap.servers": "localhost:9092",
"database.history.kafka.topic": "public.some_topic"
}
}'
{"name":"test-connector","config":{"connector.class":"io.debezium.connector.postgresql.PostgresConnector","tasks.max":"1","plugin.name":"pgoutput","database.hostname":"postgres","database.port":"5432","database.user":"postgres","database.password":"postgres","database.dbname":"test","database.server.name":"postgres","database.whitelist":"public.mytable","database.history.kafka.bootstrap.servers":"localhost:9092","database.history.kafka.topic":"public.some_topic","name":"test-connector"},"tasks":[],"type":"source"}
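As you can see, my connector was created without any errors. Now I expect Debezium CDC to publish all changes to the Kafka topic public.some_topic. To check this, I created a new Kafka consumer: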
from kafka import KafkaConsumer
from json import loads
consumer = KafkaConsumer(
    'public.some_topic',
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='earliest',
    enable_auto_commit=True,
    group_id='my-group',
    value_deserializer=lambda x: loads(x.decode('utf-8')))
for message in consumer:
    print(message)
The only difference from the first example is that I am now listening to public.some_topic. Then I go to the database console and do an insert:
test=# insert into mytable (name) values ('Tom Cat');
INSERT 0 1
test=#
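So a new value was inserted, but nothing happens in the consumer window. In other words, Debezium does not publish the event to the Kafka topic public.some_topic. What is wrong here, and how can I fix it?

Using your Docker Compose file, I see this error in the Kafka Connect worker log when the connector is created: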
Caused by: org.postgresql.util.PSQLException: ERROR: could not access file "pgoutput": No such file or directory
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:309)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:295)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:272)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:267)
at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.createReplicationSlot(PostgresReplicationConnection.java:288)
at io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:126)
... 9 more
The task status also reflects this if you query it through the Kafka Connect REST API:
curl -s "http://localhost:8083/connectors?expand=info&expand=status" | jq '."test-connector".status'
{
"name": "test-connector",
"connector": {
"state": "RUNNING",
"worker_id": "192.168.16.5:8083"
},
"tasks": [
{
"id": 0,
"state": "FAILED",
"worker_id": "192.168.16.5:8083",
"trace": "org.apache.kafka.connect.errors.ConnectException: org.postgresql.util.PSQLException: ERROR: could not access file \"pgoutput\": No such file or directory\n\tat io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:129)\n\tat io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:49)\n\tat org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:208)\n\tat org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:177)\n\tat org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:227)\n\tat java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)\n\tat java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:834)\nCaused by: org.postgresql.util.PSQLException: ERROR: could not access file \"pgoutput\": No such file or directory\n\tat org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2505)\n\tat org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2241)\n\tat org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:310)\n\tat org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:447)\n\tat org.postgresql.jdbc.PgStatement.execute(PgStatement.java:368)\n\tat org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:309)\n\tat org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:295)\n\tat org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:272)\n\tat org.postgresql.jdbc.PgStatement.execute(PgStatement.java:267)\n\tat io.debezium.connector.postgresql.connection.PostgresReplicationConnection.createReplicationSlot(PostgresReplicationConnection.java:288)\n\tat 
io.debezium.connector.postgresql.PostgresConnectorTask.start(PostgresConnectorTask.java:126)\n\t... 9 more\n"
}
],
"type": "source"
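Note that the connector itself reports RUNNING while its only task is FAILED, so a successful POST and a running container are not enough to conclude that events will flow. The same check can be scripted. What follows is a minimal sketch using only the Python standard library; the helper names failed_tasks and fetch_status are my own (hypothetical), and it assumes the worker is reachable at localhost:8083 as in the Compose file above.

```python
import json
from urllib.request import urlopen


def failed_tasks(status):
    """Return (task id, first line of the trace) for every FAILED task.

    `status` is the parsed JSON document that Kafka Connect returns for
    a connector's status, shaped like the output shown above.
    """
    failures = []
    for task in status.get("tasks", []):
        if task.get("state") == "FAILED":
            trace = task.get("trace", "")
            first_line = trace.splitlines()[0] if trace else ""
            failures.append((task["id"], first_line))
    return failures


def fetch_status(connector, base_url="http://localhost:8083"):
    """Fetch /connectors/<name>/status from the Kafka Connect REST API."""
    with urlopen(f"{base_url}/connectors/{connector}/status") as response:
        return json.load(response)
```

Calling failed_tasks(fetch_status("test-connector")) against the stack above should list task 0 together with the first line of its trace, which is where the pgoutput error appears.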
The version of Postgres that you are running is 9.6:
postgres=# show server_version;
 server_version
----------------
 9.6.16
The pgoutput plugin is only available in version 10 and later.
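If you want to guard against this before creating a connector, the version check is easy to automate. The sketch below only parses the string that show server_version returns; supports_pgoutput is a hypothetical helper name, and how you obtain the string (psql, a database driver, and so on) is left open.

```python
import re


def supports_pgoutput(server_version):
    """Return True if the Postgres major version is 10 or later.

    `server_version` is the text returned by `show server_version;`,
    for example "9.6.16" or "10.11".
    """
    match = re.match(r"\s*(\d+)", server_version)
    if match is None:
        raise ValueError(f"unrecognised server_version: {server_version!r}")
    return int(match.group(1)) >= 10
```

With the 9.6.16 value shown above this returns False, which is consistent with the "could not access file" error in the worker log.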
I changed the Docker Compose file to use version 10:
image: debezium/postgres:10
After bouncing the stack for a clean start and following your instructions, I get a running connector:
curl -s "http://localhost:8083/connectors?expand=info&expand=status" | \
       jq '. | to_entries[] | [ .value.info.type, .key, .value.status.connector.state,.value.status.tasks[].state,.value.info.config."connector.class"]|join(":|:")' | \
       column -s : -t| sed 's/\"//g'| sort
source  |  test-connector  |  RUNNING  |  RUNNING  |  io.debezium.connector.postgresql.PostgresConnector
and data in the Kafka topic:
$ docker exec kafkacat kafkacat -b kafka:9092 -t postgres.public.mytable -C
{"schema":{"type":"struct","fields":[{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":true,"field":"name"}],"optional":true,"name":"postgres.public.mytable.Value","field":"before"},{"type":"struct","fields":[{"type":"int32","optional":false,"field":"id"},{"type":"string","optional":true,"field":"name"}],"optional":true,"name":"postgres.public.mytable.Value","field":"after"},{"type":"struct","fields":[{"type":"string","optional":false,"field":"version"},{"type":"string","optional":false,"field":"connector"},{"type":"string","optional":false,"field":"name"},{"type":"int64","optional":false,"field":"ts_ms"},{"type":"string","optional":true,"name":"io.debezium.data.Enum","version":1,"parameters":{"allowed":"true,last,false"},"default":"false","field":"snapshot"},{"type":"string","optional":false,"field":"db"},{"type":"string","optional":false,"field":"schema"},{"type":"string","optional":false,"field":"table"},{"type":"int64","optional":true,"field":"txId"},{"type":"int64","optional":true,"field":"lsn"},{"type":"int64","optional":true,"field":"xmin"}],"optional":false,"name":"io.debezium.connector.postgresql.Source","field":"source"},{"type":"string","optional":false,"field":"op"},{"type":"int64","optional":true,"field":"ts_ms"}],"optional":false,"name":"postgres.public.mytable.Envelope"},"payload":{"before":null,"after":{"id":1,"name":"Tom Cat"},"source":{"version":"1.0.0.Final","connector":"postgresql","name":"postgres","ts_ms":1579172192292,"snapshot":"false","db":"test","schema":"public","table":"mytable","txId":561,"lsn":24485520,"xmin":null},"op":"c","ts_ms":1579172192347}}% Reached end of topic postgres.public.mytable [0] at offset 1
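Each message is a JSON envelope with a schema part and a payload part, and the row data sits under payload.before and payload.after. As a rough illustration of how a consumer might pick out the interesting fields, here is a small sketch; summarize_change and OP_NAMES are names I made up, and it assumes the JSON layout shown in the message above.

```python
import json

# Debezium operation codes: create, update, delete, and snapshot read.
OP_NAMES = {"c": "create", "u": "update", "d": "delete", "r": "read"}


def summarize_change(raw):
    """Reduce one Debezium change event (JSON text or bytes) to its key fields."""
    event = json.loads(raw)
    # With schemas enabled the row data is nested under "payload";
    # fall back to the top level if the schema part was disabled.
    payload = event.get("payload", event)
    source = payload["source"]
    return {
        "operation": OP_NAMES.get(payload["op"], payload["op"]),
        "table": f'{source["schema"]}.{source["table"]}',
        "before": payload["before"],
        "after": payload["after"],
    }
```

In the consumer from the question, the same function could be passed as value_deserializer in place of the lambda, provided the consumer subscribes to the topic that Debezium actually writes to.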
I added kafkacat to the Docker Compose file with:
  kafkacat:
    image: edenhill/kafkacat:1.5.0
    container_name: kafkacat
    entrypoint:
      - /bin/sh
      - -c
      - |
        while [ 1 -eq 1 ];do sleep 60;done
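Edit: I am leaving the previous answer in place because it is still useful and relevant. Debezium writes its messages to a topic named after the captured table, not to the topic named in database.history.kafka.topic. For the Postgres connector the name is built from the server name, the schema, and the table, so in your example this would be
postgres.public.mytable
This is why kafkacat is useful, because you can run
kafkacat -b broker:9092 -L
to see a list of all topics and partitions. Once you have the topic name, run
kafkacat -b broker:9092 -t postgres.public.mytable -C
to read from it.
See the kafkacat documentation for further details.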
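The topic naming can be written down as a tiny helper. This is only an illustration: debezium_topic is a hypothetical name, not part of Debezium or kafka-python, and it assumes the connector's default naming of server name, schema, and table joined by dots, which matches the postgres.public.mytable topic read with kafkacat earlier in this answer.

```python
def debezium_topic(server_name, schema, table):
    """Build the default change-event topic name for one Postgres table.

    `server_name` is the connector's database.server.name setting,
    not the database name.
    """
    return ".".join([server_name, schema, table])


# Values taken from the connector configuration in the question.
topic = debezium_topic("postgres", "public", "mytable")
```

The resulting name is what the consumer in the question would need to subscribe to instead of public.some_topic.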
There is also a demo of all this that uses Docker Compose.
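1. If you query the status of the connector, is it still running? 2. Is there anything in the Kafka Connect worker log to indicate that the connector has failed? 3. I would use kafkacat to inspect the topics and to produce/consume data :)
@Robin Moffatt. If I run docker ps, I see that my connect service is running.
@Robin Moffatt. I just checked the connector log and I see one line that keeps repeating: INFO || WorkerSourceTask{id=test-connector2-0} flushing 0 outstanding messages for offset commit [org.apache.kafka.connect.runtime.WorkerSourceTask]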
Did you solve this problem? I tried to run the Docker Compose file, but I see some errors: connect_1 | 2020-04-16 06:06:36,922 ERROR || WorkerSourceTask{id=test-connector-0} Task threw an uncaught and unrecoverable exception [org.apache.kafka.connect.runtime.WorkerTask] connect_1 | io.debezium.jdbc.JdbcConnectionException: ERROR: syntax error connect_1 | at io.debezium.connector.postgresql.connection.PostgresReplicationConnection.initPublication(PostgresReplicationConnection.java:145)