从Java中的Spark运行Cassandra时出错-org.apache.Spark.sql.catalyst上的NoClassDefFoundError
我正在使用Cassandra 3.0.3和Spark 1.6.0,并尝试通过将中的旧文档和中的新文档中的代码结合起来来运行 这是我的pom.xml文件从Java中的Spark运行Cassandra时出错-org.apache.Spark.sql.catalyst上的NoClassDefFoundError,java,spark-cassandra-connector,Java,Spark Cassandra Connector,我正在使用Cassandra 3.0.3和Spark 1.6.0,并尝试通过将中的旧文档和中的新文档中的代码结合起来来运行 这是我的pom.xml文件 <?xml version="1.0" encoding="UTF-8"?> <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>muhrafifm</groupId>
<artifactId>spark-cass-twitterdw</artifactId>
<version>1.0</version>
<packaging>jar</packaging>
<build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.0</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
</plugins>
</build>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.7</maven.compiler.source>
<maven.compiler.target>1.7</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.0.0</version>
</dependency>
<dependency>
<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>
<version>1.1.1</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.6.0-M1</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.6.0-M1</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.0</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.6.0</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.thrift</groupId>
<artifactId>libthrift</artifactId>
<version>0.9.1</version>
</dependency>
</dependencies>
这是我得到的错误
16/03/04 13:29:06 INFO Cluster: New Cassandra host /127.0.0.1:9042 added
16/03/04 13:29:06 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/package$ScalaReflectionLock$
at org.apache.spark.sql.catalyst.ReflectionLock$.<init>(ReflectionLock.scala:5)
at org.apache.spark.sql.catalyst.ReflectionLock$.<clinit>(ReflectionLock.scala)
at com.datastax.spark.connector.mapper.ReflectionColumnMapper.<init>(ReflectionColumnMapper.scala:38)
at com.datastax.spark.connector.mapper.JavaBeanColumnMapper.<init>(JavaBeanColumnMapper.scala:10)
at com.datastax.spark.connector.util.JavaApiHelper$.javaBeanColumnMapper(JavaApiHelper.scala:93)
at com.datastax.spark.connector.util.JavaApiHelper.javaBeanColumnMapper(JavaApiHelper.scala)
at com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow(CassandraJavaUtil.java:1204)
at com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow(CassandraJavaUtil.java:1222)
at muhrafifm.spark.cass.twitterdw.Demo.generateData(Demo.java:69)
at muhrafifm.spark.cass.twitterdw.Demo.run(Demo.java:35)
at muhrafifm.spark.cass.twitterdw.Demo.main(Demo.java:181)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.catalyst.package$ScalaReflectionLock$
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 11 more
16/03/04 13:29:40 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
16/03/04 13:29:41 INFO SparkContext: Invoking stop() from shutdown hook
16/03/04 13:29:41 INFO SparkUI: Stopped Spark web UI at http://10.144.233.28:4040
16/03/04 13:29:41 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/04 13:29:42 INFO MemoryStore: MemoryStore cleared
16/03/04 13:29:42 INFO BlockManager: BlockManager stopped
16/03/04 13:29:42 INFO BlockManagerMaster: BlockManagerMaster stopped
16/03/04 13:29:42 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/04 13:29:42 INFO SparkContext: Successfully stopped SparkContext
16/03/04 13:29:42 INFO ShutdownHookManager: Shutdown hook called
16/03/04 13:29:42 INFO ShutdownHookManager: Deleting directory /tmp/spark- 16fd2ae2-b61b-4411-a776-1e578caabba6
------------------------------------------------------------------------
BUILD FAILURE
16/03/04 13:29:06信息集群:新增卡桑德拉主机/127.0.0.1:9042
16/03/04 13:29:06信息Cassandra连接器:连接到Cassandra群集:测试群集
线程“main”java.lang.NoClassDefFoundError中出现异常:org/apache/spark/sql/catalyst/package$ScalaReflectionLock$
位于org.apache.spark.sql.catalyst.ReflectionLock$(ReflectionLock.scala:5)
位于org.apache.spark.sql.catalyst.ReflectionLock$(ReflectionLock.scala)
位于com.datastax.spark.connector.mapper.ReflectionColumnMapper。(ReflectionColumnMapper.scala:38)
在com.datastax.spark.connector.mapper.JavaBeanColumnMapper.(JavaBeanColumnMapper.scala:10)
在com.datastax.spark.connector.util.JavaApiHelper$.javaBeanColumnMapper(JavaApiHelper.scala:93)上
位于com.datastax.spark.connector.util.JavaApiHelper.javaBeanColumnMapper(JavaApiHelper.scala)
位于com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow(CassandraJavaUtil.java:1204)
位于com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow(CassandraJavaUtil.java:1222)
位于muhrafim.spark.cass.twitterdw.Demo.generateData(Demo.java:69)
在muhrafim.spark.cass.twitterdw.Demo.run(Demo.java:35)
位于muhrafim.spark.cass.twitterdw.Demo.main(Demo.java:181)
原因:java.lang.ClassNotFoundException:org.apache.spark.sql.catalyst.package$ScalaReflectionLock$
位于java.net.URLClassLoader.findClass(URLClassLoader.java:381)
位于java.lang.ClassLoader.loadClass(ClassLoader.java:424)
位于sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
位于java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 还有11个
16/03/04 13:29:40信息Cassandra连接器:已从Cassandra群集断开连接:测试群集
16/03/04 13:29:41信息SparkContext:从关闭挂钩调用stop()
16/03/04 13:29:41信息SparkUI:已在停止Spark web UIhttp://10.144.233.28:4040
16/03/04 13:29:41信息MapOutputRackerMasterEndpoint:MapOutputRackerMasterEndpoint已停止!
16/03/04 13:29:42信息内存存储:内存存储已清除
16/03/04 13:29:42信息区块管理器:区块管理器已停止
16/03/04 13:29:42信息BlockManagerMaster:BlockManagerMaster已停止
16/03/04 13:29:42信息OutputCommitCoordinator$OutputCommitCoordinator返回点:OutputCommitCoordinator已停止!
16/03/04 13:29:42信息SparkContext:已成功停止SparkContext
16/03/04 13:29:42信息关闭挂钩管理器:已调用关闭挂钩
16/03/04 13:29:42信息关机挂钩管理器:删除目录/tmp/spark-16fd2ae2-b61b-4411-a776-1E578CAABA6
------------------------------------------------------------------------
构建失败
我做错什么了吗?似乎需要我甚至不用的软件包,有什么可以解决的吗?或者我应该使用以前版本的卡桑德拉火花连接器
感谢您的回复。代码正在寻找
org/apache/spark/sql/catalyst/package$ScalaReflectionLock$
因此,您应该包括spark sql库,该库具有正确的依赖关系。这是我用于此应用程序的POM,它在运行时完全没有任何问题(java版本“1.8.0131”&javac 1.8.0131)。 完整的应用程序可以在这里找到。
4.0.0
火花卡桑德拉流媒体
火花卡桑德拉流媒体
0.0.1-快照
org.apache.spark
spark-core_2.11
2.2.0
假如
org.apache.spark
spark-2.10
2.2.0
假如
com.datasax.spark
spark-cassandra-connector_2.11
2.0.5
com.datasax.spark
spark-cassandra-connector-java_2.10
1.6.0-M1
com.datasax.cassandra
卡桑德拉驱动核心
3.3.0
org.apache.spark
spark-catalyst_2.10
2.2.0
org.apache.spark
spark-sql_2.10
2.2.0
${basedir}/src/main/resources
org.apache.maven.plugins
maven编译器插件
3.6.2
1.8
1.8
我也有同样的问题,问题是Spark版本和Spark Cassandra连接器之间的兼容性。
我使用的是spark 2.3,Cassandra连接器是一个旧版本
此处提供了版本兼容性矩阵:
我能成功地做到这一点 我的scala版本是2.11.12 下面是我的
pom.xml
:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.sashi</groupId>
<artifactId>SalesAnalysis</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>SalesAnalysis</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.0.5</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-catalyst_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-catalyst_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
<version>4.1.17.Final</version>
</dependency>
</dependencies>
</project>
我已经检查了依赖项,并且有spark sql库,我已经尝试导入它,但它没有任何作用。如何包含并确保包含库?要么将库放入工作人员的lib目录,要么在spark submit命令中将其作为--jars选项发送给他们是的,要么存在范围内没有spark sql的问题,要么spark和spark cassandra版本不兼容。
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>spark-cassandra-streaming</groupId>
<artifactId>spark-cassandra-streaming</artifactId>
<version>0.0.1-SNAPSHOT</version>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.0.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector-java_2.10 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.6.0-M1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.cassandra/cassandra-driver-core -->
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-catalyst_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-catalyst_2.10</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.2.0</version>
</dependency>
</dependencies>
<build>
<resources>
<resource>
<directory>${basedir}/src/main/resources</directory>
</resource>
</resources>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.6.2</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
</project>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.sashi</groupId>
<artifactId>SalesAnalysis</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>SalesAnalysis</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.0.5</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.3.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-catalyst_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-catalyst_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
<version>4.1.17.Final</version>
</dependency>
</dependencies>
</project>
spark-submit --class com.sashi.SalesAnalysis.CassandraSparkSalesAnalysis --packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.0 /home/cloudera/Desktop/spark_ex/Cassandra/sales-analysis.jar