Apache Spark: partition columns in a Hive external table created from Spark


Creating an external table with partitions from Spark:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SaveMode


// Hive support lets saveAsTable register the table in the Hive metastore.
val spark = SparkSession.builder().master("local[*]").appName("splitInput").enableHiveSupport().getOrCreate()

val sparkDf = spark.read.option("header","true").option("inferSchema","true").csv("input/candidate/event=ABCD/CandidateScheduleData_3007_2018.csv")

// Replace whitespace in column names with underscores.
var newDf = sparkDf
for (col <- sparkDf.columns) {
  newDf = newDf.withColumnRenamed(col, col.replaceAll("\\s", "_"))
}

// Supplying an explicit path makes this an external table, partitioned by CenterCode and ExamDate.
newDf.write.mode(SaveMode.Overwrite).option("path","/output/candidate/event=ABCD/").partitionBy("CenterCode","ExamDate").saveAsTable("abc.candidatelist")

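How can I replace %2F with - in the ExamDate partition value? The partition directories are currently written as ExamDate=30%2F07%2F2018 instead of ExamDate=30-07-2018.

%2F is the percent-encoded form of /. This means the data itself is formatted as 30/07/2018. You can either:

  • Parse it with to_date using the specified format
  • Format the column manually using the desired format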
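
Both options can be sketched roughly as follows. This is a minimal illustration, not a tested fix for the original job: it assumes ExamDate was read as a string in dd/MM/yyyy form, and the sample rows and the names sample, asDate and asString are made up for the example. The two-argument to_date requires Spark 2.2 or later.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, date_format, to_date}

val spark = SparkSession.builder().master("local[*]").appName("examDateFormat").getOrCreate()
import spark.implicits._

// Hypothetical sample rows standing in for the CSV data (assumption: ExamDate is a string).
val sample = Seq(("C001", "30/07/2018"), ("C002", "31/07/2018")).toDF("CenterCode", "ExamDate")

// Option 1: parse the string into a DateType column.
// Date partition values are rendered in ISO form, e.g. ExamDate=2018-07-30.
val asDate = sample.withColumn("ExamDate", to_date(col("ExamDate"), "dd/MM/yyyy"))

// Option 2: parse, then format back to a string with dashes instead of slashes.
// The value contains no "/", so nothing in it needs percent-encoding.
val asString = sample.withColumn(
  "ExamDate",
  date_format(to_date(col("ExamDate"), "dd/MM/yyyy"), "dd-MM-yyyy")
)
```

Either DataFrame can then be written with the same partitionBy("CenterCode","ExamDate") call as in the question. Option 1 keeps ExamDate as a real date, which sorts and filters correctly; option 2 keeps the day-first layout asked for in the question.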