Python Kafka consumer with offset management

python apache-kafka apache-zookeeper


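I am new to Kafka and am trying to set up a consumer that reads the messages published by a Kafka producer. Correct me if I am wrong, but my understanding is that a Kafka consumer stores its offsets in ZooKeeper? However, I am not running a ZooKeeper instance, and I want to poll every 5 minutes to see whether any new messages have been published.

The code I have so far is: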

import logging
from django.conf import settings
import kafka
import sys
import json

bootstrap_servers = ['localhost:8080']
topicName = 'test-info'
consumer = kafka.KafkaConsumer(
    topicName,
    group_id='test',
    bootstrap_servers=bootstrap_servers,
    auto_offset_reset='earliest',
)

count = 0
#print(consumer.topic)
try:
    for message in consumer:
        #print(type(message.value))
        print("\n")
        print("<>"*20)
        print("%s:%d:%d: key=%s value=%s" % (message.topic, message.partition, message.offset, message.key, message.value))
        print("--"*20)
        info = json.loads(message.value)

        if info['event'] == "new_record" and info['data']['userId'] == "user1" and info['data']['details']['userTeam'] == "foo":
            count = count + 1
            print(count, info['data']['details']['team'], info['data']['details']['leadername'], info['data']['details']['category'])
        else:
            print("Skipping")

    print(count)

except KeyboardInterrupt:
    sys.exit()


How do I save the offset so that the next poll reads only the incremental data? Any pointers would be helpful.

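It is true that older Kafka consumers stored their offsets in ZooKeeper. Since you have not installed ZooKeeper yourself, Kafka is probably using the ZooKeeper instance bundled with it. Note that with brokers 0.9 and later, kafka-python's KafkaConsumer commits group offsets to Kafka itself (the internal __consumer_offsets topic) rather than to ZooKeeper.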

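In your case you do not need to do anything more, because you have already set the group id:

group_id='test'

The consumer will therefore automatically resume from the last committed offset for that group, because it commits the latest offset automatically (enable_auto_commit is True by default).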
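If you would rather control exactly when an offset is saved, auto-commit can be switched off and the offset committed after each message has been handled. The following is only a minimal sketch, assuming the kafka-python package; make_consumer, is_match and consume_and_commit are hypothetical helper names, and the filter uses .get() lookups instead of the direct indexing in the question.

```python
import json


def make_consumer(topic, group_id, servers):
    """Build a consumer that commits offsets only when asked to."""
    # Imported lazily so the helpers below work without kafka-python installed.
    from kafka import KafkaConsumer

    return KafkaConsumer(
        topic,
        group_id=group_id,
        bootstrap_servers=servers,
        auto_offset_reset="earliest",  # only applies when the group has no committed offset
        enable_auto_commit=False,      # offsets are committed explicitly below
    )


def is_match(info):
    """Same filter as in the question: new_record events for user1 on team foo."""
    data = info.get("data", {})
    return (
        info.get("event") == "new_record"
        and data.get("userId") == "user1"
        and data.get("details", {}).get("userTeam") == "foo"
    )


def consume_and_commit(consumer):
    """Handle messages one by one, committing after each one is processed."""
    count = 0
    for message in consumer:
        if is_match(json.loads(message.value)):
            count += 1
        # Blocking commit of the offsets consumed so far for this group.
        consumer.commit()
    return count
```

A call such as consume_and_commit(make_consumer('test-info', 'test', ['localhost:8080'])) would then resume after the last committed message on the next run. Committing after every message is the simplest choice but adds a round trip per message; committing once per batch is a common compromise.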

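If you want to check every 5 minutes whether any new messages have been published, you can add the following inside the consumer for loop (with import time at the top of the file):

time.sleep(300)

Keep in mind that a sleep placed in the loop body pauses after every single message, not once per batch of new messages.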
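An alternative to sleeping inside the iterator loop is to read everything currently available with poll(), commit, and only then wait. This is a rough sketch under the assumption that kafka-python's KafkaConsumer is used; drain and poll_every are hypothetical helper names, and the rounds and sleep parameters exist only so the loop can be exercised without waiting.

```python
import time


def drain(consumer, handle, timeout_ms=1000):
    """Handle every record currently available and return how many were handled."""
    handled = 0
    while True:
        # poll() returns a dict mapping partitions to lists of records;
        # it is empty when nothing arrived within timeout_ms.
        batches = consumer.poll(timeout_ms=timeout_ms)
        if not batches:
            return handled
        for records in batches.values():
            for record in records:
                handle(record)
                handled += 1


def poll_every(consumer, handle, interval_s=300, rounds=None, sleep=time.sleep):
    """Drain, commit, wait; repeat forever, or for a fixed number of rounds."""
    done = 0
    while rounds is None or done < rounds:
        if drain(consumer, handle):
            # Save the position only when something was actually processed.
            consumer.commit()
        done += 1
        sleep(interval_s)
```

One caveat: kafka-python's max_poll_interval_ms defaults to 300000 ms, so pausing a full 300 seconds between polls may cause the consumer to be removed from its group. Raising that setting when constructing the consumer, or closing the consumer and creating a new one each round, are two possible ways around this; in both cases the committed offset lets the next round pick up where the previous one stopped.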