Apache Spark + Cassandra: java.lang.NoClassDefFoundError: com/datastax/spark/connector/japi/CassandraJavaUtil

I wrote a simple piece of code, and when I submit it I get the following error:

16/04/26 16:58:46 DEBUG ProtobufRpcEngine: Call: complete took 3ms
Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/spark/connector/japi/CassandraJavaUtil
        at com.baitic.mcava.lecturahdfssaveincassandra.TratamientoCSV.main(TratamientoCSV.java:123)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.japi.CassandraJavaUtil
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 10 more
16/04/26 16:58:46 INFO SparkContext: Invoking stop() from shutdown hook
16/04/26 16:58:46 INFO SparkUI: Stopped Spark web UI at http://10.128.0.5:4040
16/04/26 16:58:46 INFO SparkDeploySchedulerBackend: Shutting down all executors
16/04/26 16:58:46 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
16/04/26 16:58:46 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/04/26 16:58:46 INFO MemoryStore: MemoryStore cleared
16/04/26 16:58:46 INFO BlockManager: BlockManager stopped
16/04/26 16:58:46 INFO BlockManagerMaster: BlockManagerMaster stopped
16/04/26 16:58:46 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/04/26 16:58:46 INFO SparkContext: Successfully stopped SparkContext
16/04/26 16:58:46 INFO ShutdownHookManager: Shutdown hook called
16/04/26 16:58:46 INFO ShutdownHookManager: Deleting directory /srv/spark/tmp/spark-2bf57fa2-a2d5-4f8a-980c-994e56b61c44
16/04/26 16:58:46 DEBUG Client: stopping client from cache: org.apache.hadoop.ipc.Client@3fb9a67f
16/04/26 16:58:46 DEBUG Client: removing client from cache: org.apache.hadoop.ipc.Client@3fb9a67f
16/04/26 16:58:46 DEBUG Client: stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@3fb9a67f
16/04/26 16:58:46 DEBUG Client: Stopping client
16/04/26 16:58:46 DEBUG Client: IPC Client (2107841088) connection to mcava-master/10.128.0.5:54310 from baiticpruebas2: closed
16/04/26 16:58:46 DEBUG Client: IPC Client (2107841088) connection to mcava-master/10.128.0.5:54310 from baiticpruebas2: stopped, remaining connections 0
16/04/26 16:58:46 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
I also pass the connector jar to the driver with the --jars flag, but I always get the same error, and I don't understand why.


I am working on Google Compute Engine.
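
A submission of that kind would look roughly like this (the main class is the one from the stack trace; the application jar name is hypothetical):

    $SPARK_HOME/bin/spark-submit \
      --class com.baitic.mcava.lecturahdfssaveincassandra.TratamientoCSV \
      --jars hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar \
      mcava-app.jar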

Try adding the package when you submit the application:

spark.driver.extraClassPath      hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar
spark.executor.extraClassPath    hdfs://mcava-master:54310/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar
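
These two properties go in spark-defaults.conf; the command-line equivalent uses --conf (the application jar name is hypothetical, and the connector jar is assumed to sit at the same local path on every node):

    $SPARK_HOME/bin/spark-submit \
      --conf spark.driver.extraClassPath=/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar \
      --conf spark.executor.extraClassPath=/srv/hadoop/data/spark/spark-cassandra-connector-java-assembly-1.6.0-M1-4-g6f01cfe.jar \
      app.jar

One caveat: extraClassPath entries are added to the JVM classpath as-is and are never fetched first, so hdfs:// URLs like the ones above are unlikely to work there; --jars and --packages, by contrast, do ship the jars to the driver and executors, which is why they are the usual fix for this error.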

I solved the problem... I built a fat jar that contains all the dependencies, so there is no need to reference the Cassandra connector separately; the submit only has to reference the fat jar.
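
A minimal sketch of one way to build such a fat jar, assuming a Maven project (sbt-assembly achieves the same thing; the plugin coordinates are real, the version is illustrative):

    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.3</version> <!-- illustrative version -->
        <executions>
            <execution>
                <!-- Bind shading to `mvn package`, so packaging produces a
                     single jar that bundles the cassandra connector and the
                     rest of the application's dependencies. -->
                <phase>package</phase>
                <goals><goal>shade</goal></goals>
            </execution>
        </executions>
    </plugin>

Spark itself should be declared with <scope>provided</scope> so the fat jar does not bundle the Spark runtime the cluster already supplies.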

I added this parameter to solve the problem: --packages datastax:spark-cassandra-connector:1.6.0-M2-s_2.10

I used Spark from a Java program and ran into the same problem: I had not included the spark-cassandra-connector in my project's Maven dependencies. You can pull the connector in at submit time:

$SPARK_HOME/bin/spark-submit --packages datastax:spark-cassandra-connector:1.6.0-M2-s_2.11 ....

or declare it as a Maven dependency:

    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.11</artifactId>
        <version>2.0.7</version> <!-- Check actual version in maven repo -->
    </dependency>

After that, I built a fat jar with all my dependencies, and it worked!
Maybe it will help someone.
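
For reference, a minimal sketch of the kind of write that trips over the missing class. The keyspace, table, bean, and host are hypothetical, but CassandraJavaUtil with its javaFunctions/mapToRow helpers is the connector's actual Java API:

    import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
    import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

    import java.io.Serializable;
    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SaveToCassandraSketch {

        // Hypothetical JavaBean mapped onto a table with columns id and value.
        public static class Measurement implements Serializable {
            private String id;
            private Double value;
            public Measurement() {}
            public Measurement(String id, Double value) { this.id = id; this.value = value; }
            public String getId() { return id; }
            public void setId(String id) { this.id = id; }
            public Double getValue() { return value; }
            public void setValue(Double value) { this.value = value; }
        }

        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("SaveToCassandraSketch")
                    .set("spark.cassandra.connection.host", "127.0.0.1"); // hypothetical host
            JavaSparkContext sc = new JavaSparkContext(conf);

            JavaRDD<Measurement> rdd =
                    sc.parallelize(Arrays.asList(new Measurement("sensor-1", 22.5)));

            // This is where the JVM has to load CassandraJavaUtil: if the
            // connector jar never reached the driver and executors, the
            // NoClassDefFoundError from the question is thrown here.
            javaFunctions(rdd)
                    .writerBuilder("mykeyspace", "measurements", mapToRow(Measurement.class))
                    .saveToCassandra();

            sc.stop();
        }
    }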

Which Scala version are you using? Is it 2.11?

No, the connector is 2.10... I am using the latest Spark, 1.6.1. When I look for the connector with sbt I only find the .jar built for Scala 2.10, so I cannot find a Java assembly connector; but when I search for CassandraUtils I do find the class in the Scala assembly... I don't know how to fix this. When I work on my local machine everything is fine, but on Google Compute Engine I cannot get past it... it makes no sense to me.

Maybe you can use --packages datastax:spark-cassandra-connector:1.6.0-M2-s_2.10?

I tried it, but there was a problem with the serialization version. I also tried --packages datastax:spark-cassandra-connector:1.6.0-M1-s_2.10, which runs fine until the application reaches the code that saves data to Cassandra, and then the same error appears... I don't know why the JVM cannot find the Cassandra class. I cannot solve this; it is so frustrating.