Error running Cassandra from Spark in Java: NoClassDefFoundError on org.apache.spark.sql.catalyst


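I am using Cassandra 3.0.3 and Spark 1.6.0, and I am trying to run an example built by combining code from the older connector documentation with code from the newer documentation.

Here is my pom.xml file: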

<?xml version="1.0" encoding="UTF-8"?>
 <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 <modelVersion>4.0.0</modelVersion>
 <groupId>muhrafifm</groupId>
 <artifactId>spark-cass-twitterdw</artifactId>
 <version>1.0</version>
 <packaging>jar</packaging>
 <build>
    <plugins>
      <plugin>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.0</version>
          <configuration>
              <source>1.7</source>
              <target>1.7</target>
          </configuration>
      </plugin>
    </plugins>
</build>
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
</properties>
<dependencies>        
    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-core</artifactId>
        <version>3.0.0</version>
    </dependency>
    <dependency>
        <groupId>com.googlecode.json-simple</groupId>
        <artifactId>json-simple</artifactId>
        <version>1.1.1</version>
        <type>jar</type>    
    </dependency>
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.10</artifactId>
        <version>1.6.0-M1</version>
        <type>jar</type>
    </dependency>
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector-java_2.10</artifactId>
        <version>1.6.0-M1</version>
        <type>jar</type>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.6.0</version>
        <type>jar</type>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.10</artifactId>
        <version>1.6.0</version>
        <type>jar</type>
    </dependency>
    <dependency>
        <groupId>org.apache.thrift</groupId>
        <artifactId>libthrift</artifactId>
        <version>0.9.1</version>
     </dependency>
</dependencies>
</project>

This is the error I get:

16/03/04 13:29:06 INFO Cluster: New Cassandra host /127.0.0.1:9042 added
16/03/04 13:29:06 INFO CassandraConnector: Connected to Cassandra cluster: Test Cluster
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/package$ScalaReflectionLock$
    at org.apache.spark.sql.catalyst.ReflectionLock$.<init>(ReflectionLock.scala:5)
    at org.apache.spark.sql.catalyst.ReflectionLock$.<clinit>(ReflectionLock.scala)
    at com.datastax.spark.connector.mapper.ReflectionColumnMapper.<init>(ReflectionColumnMapper.scala:38)
    at com.datastax.spark.connector.mapper.JavaBeanColumnMapper.<init>(JavaBeanColumnMapper.scala:10)
    at com.datastax.spark.connector.util.JavaApiHelper$.javaBeanColumnMapper(JavaApiHelper.scala:93)
    at com.datastax.spark.connector.util.JavaApiHelper.javaBeanColumnMapper(JavaApiHelper.scala)
    at com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow(CassandraJavaUtil.java:1204)
    at com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow(CassandraJavaUtil.java:1222)
    at muhrafifm.spark.cass.twitterdw.Demo.generateData(Demo.java:69)
    at muhrafifm.spark.cass.twitterdw.Demo.run(Demo.java:35)
    at muhrafifm.spark.cass.twitterdw.Demo.main(Demo.java:181)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.catalyst.package$ScalaReflectionLock$
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 11 more
16/03/04 13:29:40 INFO CassandraConnector: Disconnected from Cassandra cluster: Test Cluster
16/03/04 13:29:41 INFO SparkContext: Invoking stop() from shutdown hook
16/03/04 13:29:41 INFO SparkUI: Stopped Spark web UI at http://10.144.233.28:4040
16/03/04 13:29:41 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/04 13:29:42 INFO MemoryStore: MemoryStore cleared
16/03/04 13:29:42 INFO BlockManager: BlockManager stopped
16/03/04 13:29:42 INFO BlockManagerMaster: BlockManagerMaster stopped
16/03/04 13:29:42 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/04 13:29:42 INFO SparkContext: Successfully stopped SparkContext
16/03/04 13:29:42 INFO ShutdownHookManager: Shutdown hook called
16/03/04 13:29:42 INFO ShutdownHookManager: Deleting directory /tmp/spark-16fd2ae2-b61b-4411-a776-1e578caabba6
------------------------------------------------------------------------
BUILD FAILURE
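Am I doing something wrong? The error seems to require a package that I am not even using. Is there a way to fix this, or should I use an earlier version of the spark-cassandra-connector?

Thanks for your replies.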

The code is looking for

org/apache/spark/sql/catalyst/package$ScalaReflectionLock$

so you should include the spark-sql library, which brings in the right dependency.
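As a minimal sketch of that suggestion, assuming the setup from the question (Spark 1.6.0 built for Scala 2.10), a dependency like the following could be added to the dependencies section of the POM. The _2.10 suffix and the 1.6.0 version are assumptions chosen to match the spark-core_2.10 entry in the question; the answer itself does not name them.

```xml
<!-- Assumed coordinates: same Scala suffix and Spark version as
     spark-core_2.10:1.6.0 in the question's POM -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.6.0</version>
</dependency>
```

spark-sql depends on spark-catalyst, so Maven should pull in the catalyst classes transitively. Running mvn dependency:tree is one way to check that spark-catalyst_2.10 actually ends up on the classpath.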

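The POM I used for this application (the spark-cassandra-streaming project listed further down) runs without any problems (java version "1.8.0_131" and javac 1.8.0_131). The complete application can be found here.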




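I had the same problem. The cause was incompatibility between the Spark version and the Spark Cassandra Connector version: I was using Spark 2.3, and the Cassandra connector was an older release.

The Spark Cassandra Connector README includes a version compatibility matrix.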
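One way to keep the Scala suffix and the Spark version consistent across artifacts is to define them once as Maven properties. The sketch below is illustrative only: the property names scala.binary.version and spark.version are assumptions, and the versions shown (Spark 2.2.0, connector 2.0.5, Scala 2.11) are borrowed from other POMs in this thread rather than from this answer. For Spark 2.3, the connector version would need to be looked up in the compatibility matrix.

```xml
<!-- Sketch only: property names are illustrative, versions are taken
     from other POMs in this thread -->
<properties>
    <scala.binary.version>2.11</scala.binary.version>
    <spark.version>2.2.0</spark.version>
</properties>

<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_${scala.binary.version}</artifactId>
        <!-- Pick this from the compatibility matrix for the Spark version in use -->
        <version>2.0.5</version>
    </dependency>
</dependencies>
```

With this layout, changing the Scala or Spark version means editing one property instead of every artifactId, which makes it harder to end up with a mix of _2.10 and _2.11 artifacts.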


I was able to get this working.

My Scala version is 2.11.12.

Below is my pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.sashi</groupId>
    <artifactId>SalesAnalysis</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>SalesAnalysis</name>
    <url>http://maven.apache.org</url>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.0</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>2.2.0</version>
        </dependency> 

        <!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector_2.11</artifactId>
            <version>2.0.5</version>
        </dependency>               


        <dependency>
            <groupId>com.datastax.cassandra</groupId>
            <artifactId>cassandra-driver-core</artifactId>
            <version>3.3.0</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-catalyst_2.11 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.11</artifactId>
            <version>2.2.0</version>
        </dependency>

        <dependency>
            <groupId>io.netty</groupId>
            <artifactId>netty-all</artifactId>
            <version>4.1.17.Final</version>
        </dependency>

    </dependencies>
</project>

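I have checked the dependencies and the spark-sql library is there. I tried importing it, but it made no difference. How do I include it and make sure it is actually included?

Either put the library in the workers' lib directory, or send it to them with the --jars option of the spark-submit command.

Yes. Either spark-sql is not in scope, or the Spark and Spark Cassandra Connector versions are incompatible.

POM for the spark-cassandra-streaming application: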
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>spark-cassandra-streaming</groupId>
    <artifactId>spark-cassandra-streaming</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <dependencies>

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.2.0</version>
            <scope>provided</scope>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming_2.10 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>2.2.0</version>
            <scope>provided</scope>
        </dependency>

        <!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector_2.11</artifactId>
            <version>2.0.5</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector-java_2.10 -->
        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector-java_2.10</artifactId>
            <version>1.6.0-M1</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/com.datastax.cassandra/cassandra-driver-core -->
        <dependency>
            <groupId>com.datastax.cassandra</groupId>
            <artifactId>cassandra-driver-core</artifactId>
            <version>3.3.0</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-catalyst_2.10 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-catalyst_2.10</artifactId>
            <version>2.2.0</version>
        </dependency>

        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.10</artifactId>
            <version>2.2.0</version>
        </dependency>    

    </dependencies>

    <build>
        <resources>
            <resource>
                <directory>${basedir}/src/main/resources</directory>
            </resource>
        </resources>
        <pluginManagement>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <version>3.6.2</version>
                    <configuration>
                        <source>1.8</source>
                        <target>1.8</target>
                    </configuration>
                </plugin>
            </plugins>
        </pluginManagement>
    </build>
</project>
spark-submit command for the SalesAnalysis application:

spark-submit --class com.sashi.SalesAnalysis.CassandraSparkSalesAnalysis --packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.0 /home/cloudera/Desktop/spark_ex/Cassandra/sales-analysis.jar