PySpark / Spark Streaming (Python 3.5): java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long

Tags: python, apache-spark, pyspark, apache-kafka, spark-streaming

I'm new to Spark Streaming and am trying to read data from a Kafka broker.

Below is my code:
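(The snippet itself did not survive the page conversion; what follows is a minimal sketch reconstructed from the traceback below. StreamingObject, fromoffset, kafkaParams, topic and objss all appear in the traceback; the class name, broker address, topic name and batch interval are assumed placeholders.)

    # Minimal sketch reconstructed from the traceback -- the class name, broker
    # address, topic name and batch interval are assumptions, not original values.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils, TopicAndPartition

    class SStreaming(object):
        def __init__(self):
            self.sc = SparkContext(appName="KafkaDirectStream")  # assumed app name
            self.ssc = StreamingContext(self.sc, 10)             # assumed 10s batches

        def StreamingObject(self):
            topic = "mytopic"                                         # assumed topic
            kafkaParams = {"metadata.broker.list": "localhost:9092"}  # assumed broker
            # Start reading partition 0 of the topic from offset 0
            # (a plain Python int -- see the ClassCastException below).
            fromoffset = {TopicAndPartition(topic, 0): 0}
            kvs = KafkaUtils.createDirectStream(
                self.ssc, [topic], kafkaParams, fromOffsets=fromoffset)
            kvs.pprint()  # print whatever arrives from the broker
            self.ssc.start()
            self.ssc.awaitTermination()

    objss = SStreaming()
    objss.StreamingObject()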

The last step is just to print whatever I get from the broker, but instead I get the error message below:

    Traceback (most recent call last):
  File "C:/Users/<user>/PycharmProjects/GCPProject/SStreaming.py", line 72, in <module>
    objss.StreamingObject()
  File "C:/Users/<user>/PycharmProjects/GCPProject/SStreaming.py", line 40, in StreamingObject
    kvs = KafkaUtils.createDirectStream(self.ssc, [topic], kafkaParams, fromOffsets = fromoffset)
  File "C:\spark\spark-2.4.0-bin-hadoop2.7\spark-2.4.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\streaming\kafka.py", line 130, in createDirectStream
  File "C:\spark\spark-2.4.0-bin-hadoop2.7\spark-2.4.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py", line 1133, in __call__
  File "C:\spark\spark-2.4.0-bin-hadoop2.7\spark-2.4.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\sql\utils.py", line 63, in deco
  File "C:\spark\spark-2.4.0-bin-hadoop2.7\spark-2.4.0-bin-hadoop2.7\python\lib\py4j-0.10.4-src.zip\py4j\protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o37.createDirectStreamWithoutMessageHandler.
: java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Long
        at org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper$$anonfun$17.apply(KafkaUtils.scala:717)
        at scala.collection.MapLike$MappedValues$$anonfun$foreach$3.apply(MapLike.scala:245)
        at scala.collection.MapLike$MappedValues$$anonfun$foreach$3.apply(MapLike.scala:245)
        at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
        at scala.collection.Iterator$class.foreach(Iterator.scala:893)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
        at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
        at scala.collection.MapLike$MappedValues.foreach(MapLike.scala:245)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
        at scala.collection.TraversableOnce$class.copyToBuffer(TraversableOnce.scala:275)
        at scala.collection.AbstractTraversable.copyToBuffer(Traversable.scala:104)
        at scala.collection.MapLike$class.toBuffer(MapLike.scala:326)
        at scala.collection.AbstractMap.toBuffer(Map.scala:59)
        at scala.collection.MapLike$class.toSeq(MapLike.scala:323)
        at scala.collection.AbstractMap.toSeq(Map.scala:59)
        at org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper.createDirectStream(KafkaUtils.scala:717)
        at org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper.createDirectStreamWithoutMessageHandler(KafkaUtils.scala:688)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:280)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Unknown Source)

19/09/18 23:23:43 INFO SparkContext: Invoking stop() from shutdown hook
19/09/18 23:23:43 INFO SparkUI: Stopped Spark web UI at http://192.168.1.6:4040
19/09/18 23:23:43 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
19/09/18 23:23:43 INFO MemoryStore: MemoryStore cleared
19/09/18 23:23:43 INFO BlockManager: BlockManager stopped
19/09/18 23:23:43 INFO BlockManagerMaster: BlockManagerMaster stopped
19/09/18 23:23:43 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
19/09/18 23:23:43 INFO SparkContext: Successfully stopped SparkContext
19/09/18 23:23:43 INFO ShutdownHookManager: Shutdown hook called
19/09/18 23:23:43 INFO ShutdownHookManager: Deleting directory C:\Users\<user>\AppData\Local\Temp\spark-4ac3750b-cdf3-4d1d-823c-2b60f62db15a
19/09/18 23:23:43 INFO ShutdownHookManager: Deleting directory C:\Users\<user>\AppData\Local\Temp\spark-4ac3750b-cdf3-4d1d-823c-2b60f62db15a\pyspark-e791b26d-bacb-47ab-b7ae-2ae66a811158
The data is in CSV format and is already sitting in the Kafka broker. I don't know where the problem is. Please help me fetch the messages from the Kafka broker.


I am using Spark 2.2.0 with spark-streaming-kafka 0.9.0, and I set this environment up on Windows.

The error java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class occurs, possibly because your local Scala version doesn't match the Scala version Spark depends on. Please check your Scala version; Spark 2.2.0 uses Scala 2.11.
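If it helps, here is a quick way to confirm both versions from a running PySpark session (a sketch; it assumes an active SparkContext named sc, as in the pyspark shell -- spark-submit --version prints the same information):

    # Print the Spark and Scala versions of the running build (sketch; assumes
    # an active SparkContext `sc`, e.g. inside the pyspark shell).
    print(sc.version)  # Spark version, e.g. "2.4.0"
    # The Scala version on the JVM side, reached through the py4j gateway:
    print(sc._jvm.scala.util.Properties.versionString())  # e.g. "version 2.11.12"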

spark-streaming-kafka 0.9.0 doesn't exist... that link lists Kafka 0.8.2.1 and higher, and the group there is org.apache.kafka, which is not the streaming-kafka integration you should be including. Likewise, that org.apache.spark dependency doesn't exist. So, did you use 0.8.2.1, then?

I had used the wrong JAR. Now I'm getting Exception in thread "Thread-3" java.lang.NoClassDefFoundError: kafka/common/TopicAndPartit
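For what it's worth, kafka.common.TopicAndPartition lives in the Kafka 0.8 client that the old direct-stream connector depends on, so that class going missing usually points at the connector jar again. Against the Spark 2.4.0 / Scala 2.11 build shown in the traceback, the matching coordinates would presumably be org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.0, pulled in at submit time, e.g.:

    # Sketch: let spark-submit fetch the 0-8 connector and its dependencies
    # (including the Kafka client that provides kafka.common.TopicAndPartition).
    # Coordinates assume the Spark 2.4.0 / Scala 2.11 build from the traceback.
    spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.0 SStreaming.py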
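As for the ClassCastException in the title: at KafkaUtils.scala:717 the Scala-side helper casts each fromOffsets value to java.lang.Long, while py4j converts a small Python int to java.lang.Integer on its way across the gateway, which would explain the failed cast. Below is a hedged, untested sketch of one possible workaround -- boxing each starting offset as a JVM Long explicitly so py4j passes it through unchanged; the jlong name and surrounding variables are illustrative only, reusing the names from the reconstructed snippet above:

    # Hedged workaround sketch (untested assumption): hand py4j an explicit
    # java.lang.Long for each starting offset instead of a bare Python int,
    # so the value is not auto-converted to java.lang.Integer.
    jlong = self.sc._jvm.java.lang.Long   # py4j handle to the java.lang.Long class
    fromoffset = {TopicAndPartition(topic, 0): jlong(0)}
    kvs = KafkaUtils.createDirectStream(
        self.ssc, [topic], kafkaParams, fromOffsets=fromoffset)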