Apache Spark: unable to connect to the Kafka server from Spark
I am trying to stream data from a Kafka server into Spark. As you can guess, I failed. I am using Spark 2.2.0 and kafka_2.11-0.11.0.1. I loaded the JARs into Eclipse and ran the code below.
package com.defne
import java.nio.ByteBuffer
import scala.util.Random
import org.apache.spark._
import org.apache.spark.streaming.dstream._
import org.apache.spark.streaming.kafka._
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.storage.StorageLevel
import org.apache.log4j.Level
import java.util.regex.Pattern
import java.util.regex.Matcher
import kafka.serializer.StringDecoder
import Utilities._
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaExample {
  def main(args: Array[String]) {
    val ssc = new StreamingContext("local[*]", "KafkaExample", Seconds(1))
    val kafkaParams = Map("metadata.broker.list" -> "kafkaIP:9092", "group.id" -> "console-consumer-9526", "zookeeper.connect" -> "localhost:2181")
    val topics = List("logstash_log").toSet
    val lines = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics).map(_._2)
    lines.print()
    ssc.checkpoint("C:/checkpoint/")
    ssc.start()
    ssc.awaitTermination()
  }
}
I got the output below. The interesting thing is that it does not report an outright connection error, but somehow I cannot connect to the Kafka server.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/11/01 10:16:55 INFO SparkContext: Running Spark version 2.2.0
17/11/01 10:16:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/01 10:16:56 INFO SparkContext: Submitted application: KafkaExample
17/11/01 10:16:56 INFO SecurityManager: Changing view acls to: user
17/11/01 10:16:56 INFO SecurityManager: Changing modify acls to: user
17/11/01 10:16:56 INFO SecurityManager: Changing view acls groups to:
17/11/01 10:16:56 INFO SecurityManager: Changing modify acls groups to:
17/11/01 10:16:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(user); groups with view permissions: Set(); users with modify permissions: Set(user); groups with modify permissions: Set()
17/11/01 10:16:58 INFO Utils: Successfully started service 'sparkDriver' on port 53749.
17/11/01 10:16:59 INFO SparkEnv: Registering MapOutputTracker
17/11/01 10:16:59 INFO SparkEnv: Registering BlockManagerMaster
17/11/01 10:16:59 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/11/01 10:16:59 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/11/01 10:16:59 INFO DiskBlockManager: Created local directory at C:\Users\user\AppData\Local\Temp\blockmgr-2fa455d5-ef26-4fb9-ba4b-caf9f2fa3a68
17/11/01 10:16:59 INFO MemoryStore: MemoryStore started with capacity 897.6 MB
17/11/01 10:16:59 INFO SparkEnv: Registering OutputCommitCoordinator
17/11/01 10:16:59 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/11/01 10:17:00 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://192.168.56.1:4040
17/11/01 10:17:00 INFO Executor: Starting executor ID driver on host localhost
17/11/01 10:17:00 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 53770.
17/11/01 10:17:00 INFO NettyBlockTransferService: Server created on 192.168.56.1:53770
17/11/01 10:17:00 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/11/01 10:17:00 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 192.168.56.1, 53770, None)
17/11/01 10:17:00 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.56.1:53770 with 897.6 MB RAM, BlockManagerId(driver, 192.168.56.1, 53770, None)
17/11/01 10:17:00 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 192.168.56.1, 53770, None)
17/11/01 10:17:00 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 192.168.56.1, 53770, None)
17/11/01 10:17:01 INFO VerifiableProperties: Verifying properties
17/11/01 10:17:01 INFO VerifiableProperties: Property group.id is overridden to console-consumer-9526
17/11/01 10:17:01 INFO VerifiableProperties: Property zookeeper.connect is overridden to localhost:2181
17/11/01 10:17:02 INFO SimpleConsumer: Reconnect due to error:
java.lang.NoSuchMethodError: org.apache.kafka.common.network.NetworkSend.<init>(Ljava/lang/String;Ljava/nio/ByteBuffer;)V
at kafka.network.RequestOrResponseSend.<init>(RequestOrResponseSend.scala:41)
at kafka.network.RequestOrResponseSend.<init>(RequestOrResponseSend.scala:44)
at kafka.network.BlockingChannel.send(BlockingChannel.scala:114)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:88)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:86)
at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:114)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:126)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:125)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:346)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:342)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at org.apache.spark.streaming.kafka.KafkaCluster.org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers(KafkaCluster.scala:342)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitionMetadata(KafkaCluster.scala:125)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitions(KafkaCluster.scala:112)
at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:211)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
at com.defne.KafkaExample$.main(KafkaExample.scala:44)
at com.defne.KafkaExample.main(KafkaExample.scala)
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.kafka.common.network.NetworkSend.<init>(Ljava/lang/String;Ljava/nio/ByteBuffer;)V
at kafka.network.RequestOrResponseSend.<init>(RequestOrResponseSend.scala:41)
at kafka.network.RequestOrResponseSend.<init>(RequestOrResponseSend.scala:44)
at kafka.network.BlockingChannel.send(BlockingChannel.scala:114)
at kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:101)
at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:86)
at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:114)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:126)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$getPartitionMetadata$1.apply(KafkaCluster.scala:125)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:346)
at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers$1.apply(KafkaCluster.scala:342)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at org.apache.spark.streaming.kafka.KafkaCluster.org$apache$spark$streaming$kafka$KafkaCluster$$withBrokers(KafkaCluster.scala:342)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitionMetadata(KafkaCluster.scala:125)
at org.apache.spark.streaming.kafka.KafkaCluster.getPartitions(KafkaCluster.scala:112)
at org.apache.spark.streaming.kafka.KafkaUtils$.getFromOffsets(KafkaUtils.scala:211)
at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:484)
at com.defne.KafkaExample$.main(KafkaExample.scala:44)
at com.defne.KafkaExample.main(KafkaExample.scala)
17/11/01 10:17:02 INFO SparkContext: Invoking stop() from shutdown hook
17/11/01 10:17:02 INFO SparkUI: Stopped Spark web UI at http://192.168.56.1:4040
17/11/01 10:17:02 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/11/01 10:17:02 INFO MemoryStore: MemoryStore cleared
17/11/01 10:17:02 INFO BlockManager: BlockManager stopped
17/11/01 10:17:02 INFO BlockManagerMaster: BlockManagerMaster stopped
17/11/01 10:17:02 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/11/01 10:17:02 INFO SparkContext: Successfully stopped SparkContext
17/11/01 10:17:02 INFO ShutdownHookManager: Shutdown hook called
17/11/01 10:17:02 INFO ShutdownHookManager: Deleting directory C:\Users\user\AppData\Local\Temp\spark-a584950c-10ca-422b-990e-fd1980e2260c
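A `NoSuchMethodError` on `org.apache.kafka.common.network.NetworkSend.<init>` is a classpath problem, not a connectivity problem: the `spark-streaming-kafka-0-8` connector is built against the Kafka 0.8.2.1 client, and mixing in the `kafka_2.11-0.11.0.1` jars replaces that constructor with an incompatible signature. One way to avoid hand-managed jars (a sketch, assuming an sbt build) is to declare only the Spark connector and let it pull in its own matching Kafka client:

```scala
// build.sbt (sketch): let the connector resolve a consistent Kafka client.
// Do NOT also add the broker jars (kafka_2.11-0.11.0.1) to the classpath.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"                % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming"           % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.2.0"
)
```

A Kafka 0.11 broker still accepts requests from the older 0.8 client protocol, so the version mismatch to fix here is between the jars on the driver's classpath, not between client and broker.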
A minimal direct-stream setup looks like this; note that `metadata.broker.list` must point at the Kafka brokers, not at Zookeeper:

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
val ssc = new StreamingContext(new SparkConf, Seconds(60))
// hostname:port for Kafka brokers, not Zookeeper
val kafkaParams = Map("metadata.broker.list" -> "localhost:9092,anotherhost:9092")
val topics = Set("sometopic", "anothertopic")
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)
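Since the broker here is Kafka 0.11, another option is the `spark-streaming-kafka-0-10` connector, which ships the new-protocol consumer and avoids the old 0.8 client entirely. The sketch below (untested against the asker's cluster; the broker address, topic, and group id are taken from the question) shows the equivalent setup with the 0-10 API:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaExample010 {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("KafkaExample010")
    val ssc = new StreamingContext(conf, Seconds(1))

    // The 0-10 connector uses the new consumer config keys:
    // bootstrap.servers and explicit deserializer classes.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafkaIP:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "console-consumer-9526",
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](Array("logstash_log"), kafkaParams)
    )

    // Print the message values, like the map(_._2) in the 0-8 version.
    stream.map(_.value).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The corresponding dependency is `"org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.2.0"`; as with the 0-8 connector, do not add separate Kafka client jars alongside it.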