Apache Spark: persisting a Dataset when saving as Parquet and writing to Redshift

Does it make sense to persist the Dataset when I need to write it to S3 as Parquet files and also save it to Redshift?

dataset.write
  .mode(SaveMode.Overwrite)
  .parquet(s"s3a://s3path")

dataset.write.format("com.databricks.spark.redshift")
  .option("url", redshiftJdbcUrl)
  .option("dbtable", "table")
  .option("tempdir", s"s3a://s3PathTemp")
  .mode("append")
  .save()
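In this example I write to Parquet, save to Redshift, and also need to grab the max id and save it separately in S3. Does persisting make sense in that case?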

val lastId: String = dataset
  .describe("id")
  .filter("summary = 'max'")
  .select("id")
  .collect()(0)
  .getString(0)

Seq(lastId).toDS().coalesce(1).write
  .mode(SaveMode.Overwrite)
  .text(S3PathMaxId)
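
For reference, here is a minimal sketch of what persisting would look like for the three steps above. The Parquet write, the Redshift write, and the max-id lookup are each a separate Spark action, so without caching the lineage of `dataset` is evaluated once per action. Whether caching actually pays off depends on how expensive that lineage is and whether the data fits in memory or on local disk. The names `PersistSketch`, `maxIdAsString`, `writeAll`, and the path parameters are hypothetical and not from the original code. The sketch also assumes `dataset` is a `DataFrame`, and it swaps `describe("id")` for a plain `max` aggregation, which avoids computing count, mean, and stddev as well.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}
import org.apache.spark.sql.functions.max
import org.apache.spark.storage.StorageLevel

object PersistSketch {

  // Hypothetical helper: the max of column "id" rendered as a string,
  // or None when the input has no non-null ids.
  def maxIdAsString(df: DataFrame): Option[String] = {
    val row = df.agg(max("id").cast("string")).head()
    if (row.isNullAt(0)) None else Some(row.getString(0))
  }

  // Hypothetical wrapper around the three actions from the question.
  // All path and URL parameters are placeholders supplied by the caller.
  def writeAll(
      spark: SparkSession,
      dataset: DataFrame,
      parquetPath: String,
      redshiftJdbcUrl: String,
      redshiftTempDir: String,
      maxIdPath: String): Unit = {
    import spark.implicits._

    // persist() is lazy: the cache is filled by the first action below
    // and reused by the later ones.
    val cached = dataset.persist(StorageLevel.MEMORY_AND_DISK)
    try {
      // Action 1: Parquet on S3.
      cached.write.mode(SaveMode.Overwrite).parquet(parquetPath)

      // Action 2: Redshift, same options as in the question.
      cached.write.format("com.databricks.spark.redshift")
        .option("url", redshiftJdbcUrl)
        .option("dbtable", "table")
        .option("tempdir", redshiftTempDir)
        .mode("append")
        .save()

      // Action 3: max id written as a single text file.
      maxIdAsString(cached).foreach { lastId =>
        Seq(lastId).toDS().coalesce(1).write
          .mode(SaveMode.Overwrite)
          .text(maxIdPath)
      }
    } finally {
      // Release the cached blocks even if one of the writes fails.
      cached.unpersist()
    }
  }
}
```

`MEMORY_AND_DISK` is used here so that partitions which do not fit in memory spill to local disk instead of being recomputed. If the Dataset is read straight from a source with little transformation, the cost of caching may outweigh the saving, so it is worth timing both variants.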