Apache Kafka: How does MirrorMaker 2 mirror consumer offsets, and why is it so slow?
I am using MirrorMaker 2 to migrate from one AWS MSK cluster to another. The source cluster runs Kafka 2.4.1.1 and the target cluster runs 2.7. My MirrorMaker 2 runs on an m5.large EC2 instance using the Kafka 2.7 SDK.
I want to replicate all topics and consumer offsets from $SOURCE_CLUSTER to $TARGET_CLUSTER.
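My testTopic seems to have been replicated correctly (including consumer group offsets). I believe this because when I consume testTopic from $SOURCE_CLUSTER with kafkacat and then consume it from $TARGET_CLUSTER, the messages are not re-consumed on the target: the offsets have already been updated on $TARGET_CLUSTER (by MirrorMaker).
However, when I check some of the larger topics, the offsets appear to be updating at a rate of only 2-3 per second, as I try to demonstrate below.
Here I describe the group MyConsumerGroup on $SOURCE_CLUSTER and then on $TARGET_CLUSTER: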
[ec2-user@ip-x-x-x-x ~]$ k/bin/kafka-consumer-groups.sh --bootstrap-server $SOURCE_CLUSTER --describe --group MyConsumerGroup
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
MyConsumerGroup MyLargeTopic 4 772259 772259 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 8 821326 821326 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 9 786077 786077 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 7 844962 844964 2 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 0 784451 784451 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 3 845682 845682 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 6 827488 827488 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 2 843823 843823 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 5 818343 818343 0 <some consumer id> rdkafka
MyConsumerGroup MyLargeTopic 1 802264 802264 0 <some consumer id> rdkafka
[ec2-user@ip-x-x-x-x ~]$ k/bin/kafka-consumer-groups.sh --bootstrap-server $TARGET_CLUSTER --describe --group MyConsumerGroup
Consumer group 'MyConsumerGroup' has no active members.
GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG CONSUMER-ID HOST CLIENT-ID
MyConsumerGroup MyLargeTopic 7 832171 288324 -543847 - - -
MyConsumerGroup MyLargeTopic 6 814062 260857 -553205 - - -
MyConsumerGroup MyLargeTopic 5 801912 254791 -547121 - - -
MyConsumerGroup MyLargeTopic 4 758982 249167 -509815 - - -
MyConsumerGroup MyLargeTopic 9 770665 238708 -531957 - - -
MyConsumerGroup MyLargeTopic 8 806443 267920 -538523 - - -
MyConsumerGroup MyLargeTopic 3 831331 283500 -547831 - - -
MyConsumerGroup MyLargeTopic 2 831028 250147 -580881 - - -
MyConsumerGroup MyLargeTopic 1 789425 272097 -517328 - - -
MyConsumerGroup MyLargeTopic 0 768326 245568 -522758 - - -
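For reference, the LAG column is LOG-END-OFFSET minus CURRENT-OFFSET, so the negative values mean the group offsets committed on the target are ahead of the target log end, i.e. ahead of what has been mirrored so far. A minimal sketch that recomputes this from the $TARGET_CLUSTER rows above (the `parse_lag` helper is hypothetical and not part of any Kafka tooling):

```python
# First five columns of the $TARGET_CLUSTER rows shown above:
# GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET
TARGET_ROWS = """\
MyConsumerGroup MyLargeTopic 7 832171 288324
MyConsumerGroup MyLargeTopic 6 814062 260857
MyConsumerGroup MyLargeTopic 5 801912 254791
MyConsumerGroup MyLargeTopic 4 758982 249167
MyConsumerGroup MyLargeTopic 9 770665 238708
MyConsumerGroup MyLargeTopic 8 806443 267920
MyConsumerGroup MyLargeTopic 3 831331 283500
MyConsumerGroup MyLargeTopic 2 831028 250147
MyConsumerGroup MyLargeTopic 1 789425 272097
MyConsumerGroup MyLargeTopic 0 768326 245568
"""


def parse_lag(rows):
    """Return {partition: (current_offset, log_end_offset, lag)}."""
    result = {}
    for line in rows.strip().splitlines():
        _group, _topic, partition, current, log_end = line.split()[:5]
        current, log_end = int(current), int(log_end)
        # Lag is the log end offset minus the committed group offset.
        result[int(partition)] = (current, log_end, log_end - current)
    return result


lags = parse_lag(TARGET_ROWS)
for partition in sorted(lags):
    current, log_end, lag = lags[partition]
    print(f"partition {partition}: committed={current} log_end={log_end} lag={lag}")
```

Every partition comes out negative by roughly half a million, which matches the LAG column in the tool output.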
Subsequent runs of the above command show LOG-END-OFFSET incrementing by 2-3 per second.
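To put that rate in perspective, here is a rough back-of-the-envelope estimate for partition 7. It assumes the 2-3 per second figure is per partition and stays constant, that no new messages arrive on the source, and that target offsets line up one-to-one with source offsets (MirrorMaker 2 does not guarantee this). The `hours_to_catch_up` helper is hypothetical and only illustrates the arithmetic:

```python
def hours_to_catch_up(source_log_end, target_log_end, records_per_second):
    """Hours until the target log end reaches the source log end at a fixed rate."""
    remaining = max(source_log_end - target_log_end, 0)
    return remaining / records_per_second / 3600


# LOG-END-OFFSET values for partition 7, taken from the two outputs above.
SOURCE_LOG_END = 844964
TARGET_LOG_END = 288324

for rate in (2, 3):
    hours = hours_to_catch_up(SOURCE_LOG_END, TARGET_LOG_END, rate)
    print(f"at {rate} records/s: about {hours:.1f} hours")
```

Under those assumptions a single partition would need roughly two to three days to catch up, which is why the rate looks wrong to me.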
My mm2.properties file is:
[ec2-user@ip-x-x-x-x ~]$ cat k/config/mm2.properties
clusters = source, target
source.bootstrap.servers=$SOURCE_CLUSTER
target.bootstrap.servers=$TARGET_CLUSTER
# Source and target clusters configurations.
source.config.storage.replication.factor = 3
target.config.storage.replication.factor = 3
source.offset.storage.replication.factor = 3
target.offset.storage.replication.factor = 3
source.status.storage.replication.factor = 3
target.status.storage.replication.factor = 3
source->target.enabled = true
target->source.enabled = false
source->target.sync.group.offsets.enabled=true
source->target.producer.override.compression.type=gzip
source->target.emit.heartbeats.enabled = true
source->target.emit.checkpoints.enabled = true
source.producer.override.batch.size = 327680
# Mirror maker configurations.
offset-syncs.topic.replication.factor = 3
heartbeats.topic.replication.factor = 3
checkpoints.topic.replication.factor = 3
topics = .*
groups = .*
replication.policy.class=com.amazonaws.kafka.samples.CustomMM2ReplicationPolicy
source.cluster.producer.enable.idempotence = true
target.cluster.producer.enable.idempotence = true
tasks.max = 1
replication.factor = 3
refresh.topics.enabled = true
source.producer.compression.type=gzip
target.producer.compression.type=gzip
source.producer.connections.max.idle.ms=180000
producer.enable.idempotence=true
# Enable heartbeats and checkpoints.
# customize as needed
sync.topic.acls.enabled = false
Can someone explain why LOG-END-OFFSET is growing so slowly on my target cluster? No consumers are connected to $TARGET_CLUSTER, so all updates are made by MirrorMaker 2.