
Writing a DataFrame to a Cassandra table in Java


I haven't found what I need. There is loads of code in Scala and Python. Here is what I have:

import org.apache.log4j.Logger;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

public class CassandraWriter {
    private transient Logger logger = Logger.getLogger(CassandraWriter.class);
    private Dataset<Row> hdfsDF;

    public CassandraWriter(Dataset<Row> dataFrame) {
        hdfsDF = dataFrame;
    }

    public void writeToCassandra(String tableName, String keyspace) {
        logger.info("Writing DataFrame to table: " + tableName);

        hdfsDF.write().format("org.apache.spark.sql.cassandra").mode("overwrite")
                .option("table",tableName)
                .option("keyspace",keyspace)
                .save();

        logger.info("Inserted DataFrame to Cassandra successfully");
    }
}

Any ideas?

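You need to make sure that the Spark Cassandra Connector is included in the resulting jar that you submit.

One way to do this is to build a so-called fat jar and submit that. For example, here is a sample Maven configuration: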

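...
<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <scala.version>2.11.12</scala.version>
  <spark.version>2.4.4</spark.version>
  <spark.scala.version>2.11</spark.scala.version>
  <scc.version>2.4.1</scc.version>
  <java.version>1.8</java.version>
</properties>
<dependencies>
  <dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_${spark.scala.version}</artifactId>
    <version>${scc.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_${spark.scala.version}</artifactId>
    <version>${spark.version}</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
...
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>3.2.0</version>
  <configuration>
    <descriptorRefs>
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>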

Alternatively, you can specify spark-cassandra-connector as a package when submitting the job, via
--packages com.datastax.spark:spark-cassandra-connector_2.11:2.4.2
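
Whichever route you take, it can help to confirm that the connector really ended up on the classpath before calling write(). The following is only a minimal sketch: the class name ConnectorClasspathCheck and the helper isOnClasspath are made up for illustration, and it assumes that Spark resolves the format name org.apache.spark.sql.cassandra to a class named org.apache.spark.sql.cassandra.DefaultSource, which is what the connector appears to ship.

```java
public class ConnectorClasspathCheck {
    // Assumption: the class Spark looks up for format("org.apache.spark.sql.cassandra").
    static final String CASSANDRA_SOURCE = "org.apache.spark.sql.cassandra.DefaultSource";

    // Hypothetical helper: returns true if the named class can be located
    // by the class loader that loaded this class, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ConnectorClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(CASSANDRA_SOURCE)) {
            System.out.println("Spark Cassandra Connector found on the classpath");
        } else {
            System.out.println("Spark Cassandra Connector is missing: "
                    + "build a fat jar or pass it with --packages");
        }
    }
}
```

Calling isOnClasspath(CASSANDRA_SOURCE) at the start of the job and logging the result should make a packaging problem visible earlier than the "Failed to find data source" error does.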

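Error message from the question:

Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra. Please find packages at http://spark.apache.org/third-party-projects.html

Which dependencies does the project include?

spark-hive_2.10, spark-mllib-local_2.10, spark-core_2.10, spark-sql_2.10, spark-cassandra-connector_2.11. Maybe in my spark-submit command? Do I need to call Cassandra in the spark-submit command?

The error shows that a class is missing from the classpath. Look at some of the Spark examples, which include the import
import org.apache.spark.sql.cassandra._

I have import org.apache.spark.sql.cassandra.*; at the top of my Java class. It is greyed out in IntelliJ because it is not being used.

This worked!! Now I'm getting a new error. As before, I create a Spark session and read the DataFrame. Then I create a new session connected to Cassandra. That works! Thank goodness, I got connected. However, when I try to write to the table, I get Exception in thread "main" java.io.IOException: Failed to open native connection to Cassandra at {169.91.111.198}:9042, caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /169.91.111.198:9042 (com.datastax.driver.core.exceptions.TransportException: [/169.91.111.198:9042] ...

Check that you can reach the node on port 9042... You may have a firewall, or the node may be advertising the wrong address...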
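
For the later NoHostAvailableException, the advice to check that the node is reachable on port 9042 can be tried with a plain TCP connect from the machine that runs the Spark driver. This is a rough sketch rather than a full diagnosis: the class name CassandraPortCheck, the helper canConnect, and the 3 second timeout are all illustrative choices, and a successful connect only shows that the port is open, not that Cassandra advertises the right address.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class CassandraPortCheck {
    // Hypothetical helper: true if a TCP connection to host:port
    // can be opened within timeoutMillis, false on any I/O failure.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the Cassandra host and port as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9042;
        boolean reachable = canConnect(host, port, 3000);
        System.out.println(host + ":" + port
                + (reachable ? " is reachable" : " is NOT reachable"));
    }
}
```

If this reports the port as not reachable from the driver host, the problem is likely a firewall or network issue rather than anything in the DataFrame write itself.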