
Apache Spark: updating Cassandra from Spark


I have a table, tfm.foehis, in Cassandra that already contains data.

The first time I sent data from Spark to Cassandra, I used the following set of commands:

import org.apache.spark.sql.functions._
import com.datastax.spark.connector._
import org.apache.spark.sql.cassandra._

val wkdir="/home/adminbigdata/tablas/"
val fileIn= "originales/22_FOEHIS2.csv"
val fileOut= "22_FOEHIS_PRE2"
val fileCQL= "22_FOEHISCQL"

val data = sc.textFile(wkdir + fileIn).filter(!_.contains("----")).map(_.trim.replaceAll(" +", "")).map(_.dropRight(1)).map(_.drop(1)).map(_.replaceAll(",", "")).filter(array => array(6) != "MOBIDI").filter(array => array(17) != "").saveAsTextFile(wkdir + fileOut)
val firstDF = spark.read.format("csv").option("header", "true").option("inferSchema", "true").option("mode", "DROPMALFORMED").option("delimiter", "|").load(wkdir + fileOut)
val columns: Array[String] = firstDF.columns
val reorderedColumnNames: Array[String] = Array("hoclic","hodtac","hohrac","hotpac","honrac","hocdan","hocdrs","hocdsl","hocol","hocpny","hodesf","hodtcl","hodtcm","hodtea","hodtra","hodtrc","hodtto","hodtua","hohrcl","hohrcm","hohrea","hohrra","hohrrc","hohrua","holinh","holinr","honumr","hoobs","hooe","hotdsc","hotour","housca","houscl","houscm","housea","houser","housra","housrc")
val secondDF= firstDF.select(reorderedColumnNames.head, reorderedColumnNames.tail: _*)
secondDF.write.cassandraFormat("foehis", "tfm").save()

However, when I use the same script to load new data, I get an error, and I don't know what is wrong. This is the message:

java.lang.UnsupportedOperationException: 'SaveMode is set to ErrorIfExists and Table
tfm.foehis already exists and contains data.
Perhaps you meant to set the DataFrame write mode to Append?
Example: df.write.format.options.mode(SaveMode.Append).save()" '

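The error message tells you exactly what to do: use Append mode, and it even shows an example. This happens because the target table already exists and contains data, while the write mode is left at its default, ErrorIfExists. If you still want to write the data, the code should look like this: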

import org.apache.spark.sql.SaveMode
secondDF.write.cassandraFormat("foehis", "tfm").mode(SaveMode.Append).save()
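
For reference, the same append can be written without the cassandraFormat shortcut, using the generic DataFrameWriter calls that the error message hints at. The following is a minimal sketch, not the only way to do it: appendToCassandra is a hypothetical helper name introduced here for illustration, and it assumes the Spark Cassandra Connector is on the classpath and that the target table already exists. Keep in mind that Cassandra writes are upserts, so appended rows whose primary key matches an existing row will overwrite that row rather than create a duplicate.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper (not part of the connector): appends the rows of a
// DataFrame to an existing Cassandra table instead of failing when the
// table already contains data.
def appendToCassandra(df: DataFrame, keyspace: String, table: String): Unit = {
  df.write
    .format("org.apache.spark.sql.cassandra")                // long form of cassandraFormat
    .options(Map("keyspace" -> keyspace, "table" -> table))  // target keyspace and table
    .mode(SaveMode.Append)                                   // override the ErrorIfExists default
    .save()
}

// Usage with the DataFrame from the question:
// appendToCassandra(secondDF, "tfm", "foehis")
```

Wrapping the write in a small function keeps the save mode in one place, so later loads with the same script cannot fall back to the default mode by accident.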