Hadoop: Unable to send JSON tweet events to a Kafka topic/producer using the Kafka command line

Tags: hadoop, apache-kafka, apache-zookeeper, hortonworks-sandbox

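I have created a Python script, raw_tweets_stream.py, that uses the Twitter API to stream Twitter data. The JSON data from Twitter is passed from this script to the Kafka producer on the command line.

raw_json_tweets is the Kafka topic created for these tweets. The Python script raw_tweets_stream.py runs fine, but an error is thrown when its output is sent to the Kafka producer. I am using the Hortonworks HDP 2.3.1 sandbox, and I have made sure that ZooKeeper and Kafka are both running.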
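The script itself is not shown in the question, so the following is only a minimal sketch of the piping pattern it relies on. A console producer reading from standard input sends each line as one message, so the script should write exactly one JSON object per line and flush after each one. The function name `emit_events` and the sample tweet fields are hypothetical and are not taken from raw_tweets_stream.py.

```python
import json
import sys


def emit_events(events, out=sys.stdout):
    """Write each event as a single-line JSON document and return the count."""
    count = 0
    for event in events:
        # json.dumps escapes embedded newlines, so each event stays on one line.
        out.write(json.dumps(event) + "\n")
        # Flush so that a downstream pipe sees the line immediately.
        out.flush()
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical sample data standing in for the Twitter stream.
    sample = [{"id": 1, "text": "hello\nworld"}, {"id": 2, "text": "kafka"}]
    emit_events(sample)
```

Keeping one event per line matters because a tweet whose text contains a raw newline would otherwise be split into several broken messages.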


/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic raw_json_tweets

Topic:raw_json_tweets      PartitionCount:1        ReplicationFactor:1     Configs:
            Topic: raw_json_tweets     Partition: 0    Leader: 0       Replicas: 0     Isr: 0
Error:

[2016-08-25 22:36:26,212] ERROR Failed to send requests for topics raw_json_tweets with correlation ids in [57,64] (kafka.producer.async.DefaultEventHandler)
[2016-08-25 22:36:26,213] ERROR Error in handling batch of 131 events (kafka.producer.async.ProducerSendThread)
kafka.common.FailedToSendMessageException: Failed to send messages after 3 tries.
        at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:91)
        at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
        at scala.collection.immutable.Stream.foreach(Stream.scala:547)
        at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
        at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
[2016-08-25 22:36:27,217] WARN Fetching topic metadata with correlation id 65 for topics [Set(json_tweets1)] from broker [BrokerEndPoint(0,localhost,2181)] failed (kafka.client.ClientUtils$)
java.io.EOFException: Received -1 when reading from channel, socket has likely been closed.
        at kafka.utils.CoreUtils$.read(CoreUtils.scala:193)
        at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
        at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
        at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
        at kafka.network.BlockingChannel.receive(BlockingChannel.scala:131)
        at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:77)
        at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:74)
        at kafka.producer.SyncProducer.send(SyncProducer.scala:115)
        at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:59)
        at kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
        at kafka.producer.BrokerPartitionInfo.getBrokerPartitionInfo(BrokerPartitionInfo.scala:49)
        at kafka.producer.async.DefaultEventHandler.kafka$producer$async$DefaultEventHandler$$getPartitionListForTopic(DefaultEventHandler.scala:188)
        at kafka.producer.async.DefaultEventHandler$$anonfun$partitionAndCollate$1.apply(DefaultEventHandler.scala:152)
        at kafka.producer.async.DefaultEventHandler$$anonfun$partitionAndCollate$1.apply(DefaultEventHandler.scala:151)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at kafka.producer.async.DefaultEventHandler.partitionAndCollate(DefaultEventHandler.scala:151)
        at kafka.producer.async.DefaultEventHandler.dispatchSerializedData(DefaultEventHandler.scala:96)
        at kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:73)
        at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88)
        at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68)
        at scala.collection.immutable.Stream.foreach(Stream.scala:547)
        at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67)
        at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45)
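The WARN line in this log names the endpoint the producer actually contacted, which is a quick way to spot a wrong host or port. As a rough sketch, the endpoint can be pulled out of the log text with a regular expression. The helper name `broker_endpoints` is hypothetical, and the pattern is based only on the `BrokerEndPoint(id,host,port)` text as it appears in the log above.

```python
import re

# Matches the endpoint text as it appears in the log above,
# e.g. "BrokerEndPoint(0,localhost,2181)".
ENDPOINT_RE = re.compile(r"BrokerEndPoint\((\d+),([^,()]+),(\d+)\)")


def broker_endpoints(log_text):
    """Return (broker_id, host, port) tuples found in the log text."""
    return [
        (int(broker_id), host, int(port))
        for broker_id, host, port in ENDPOINT_RE.findall(log_text)
    ]


line = (
    "WARN Fetching topic metadata with correlation id 65 for topics "
    "[Set(json_tweets1)] from broker [BrokerEndPoint(0,localhost,2181)] failed"
)
print(broker_endpoints(line))  # [(0, 'localhost', 2181)]
```

Here the producer was talking to localhost:2181, the same address used for ZooKeeper in the kafka-topics.sh command, which suggests it was never reaching a Kafka broker.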

Update: Solution


  • Went to Ambari Services and changed the Kafka log directory to
    /tmp/kafka-logs
  • Modified the original script to include the correct port and hostname

    python raw_tweets_stream.py

  • Topic:raw_json_tweets      PartitionCount:1        ReplicationFactor:1     Configs:
                Topic: raw_json_tweets     Partition: 0    Leader: 0       Replicas: 0     Isr: 0
    
  • Verified with the console consumer that events are being sent to the Kafka topic

    /usr/hdp/2.3.0.0-2557/kafka/bin/kafka-console-consumer.sh --zookeeper sandbox.hortonworks.com:2181 --topic raw_json_tweets --from-beginning


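  • It looks like you pointed --broker-list at ZooKeeper (port 2181), whereas it needs to point at the Kafka broker, which listens on port 9092 by default, or on port 6667 when installed through Ambari.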

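    Thanks for pointing that out, @Binary Nerd. I updated the Kafka broker address with port 9092, but it still throws an error. Here is part of the error --> [2016-08-26 13:24:12,718] ERROR Failed to collate messages by topic, partition due to: fetching topic metadata for topics [Set(raw_json_tweets)] from broker [ArrayBuffer(BrokerEndPoint(0,localhost,9092))] failed (kafka.producer.async.DefaultEventHandler) .. java.nio.channels.ClosedChannelException

    According to the Hortonworks documentation, if you are using Ambari the default port is 6667, so perhaps try that.

    You are right. The correct port on Hortonworks is 6667 (I also verified this in Ambari Services). I modified the script to use the correct port, 6667, and the full hostname, sandbox.hortonworks.com, and it works like a charm.

    python raw_tweets_stream.py | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list sandbox.hortonworks.com:6667 --topic raw_json_tweets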
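The mistake discussed in this thread, passing the ZooKeeper address as the broker list, can be caught before the producer is started. The sketch below is a hypothetical pre-flight check and is not part of Kafka's tooling. It assumes ZooKeeper uses client port 2181, as in this thread, and simply flags any broker entry on that port.

```python
ZOOKEEPER_PORT = 2181  # ZooKeeper client port used in this thread (assumption)


def suspicious_brokers(broker_list, zookeeper_port=ZOOKEEPER_PORT):
    """Return the host:port entries whose port equals the ZooKeeper port."""
    flagged = []
    for entry in broker_list.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the last colon so the host part is kept intact.
        host, _, port = entry.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        if int(port) == zookeeper_port:
            flagged.append(entry)
    return flagged


print(suspicious_brokers("localhost:2181"))                # ['localhost:2181']
print(suspicious_brokers("sandbox.hortonworks.com:6667"))  # []
```

A check like this only catches the port mix-up. It does not confirm that a broker is actually listening on the given host and port, which is what the 9092 versus 6667 confusion above came down to.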