Apache Spark: replacing a column's values in Spark Structured Streaming

I have the following DataFrame:

val tDataJsonDF = kafkaStreamingDFParquet
   .filter($"value".contains("tUse"))
   .filter($"value".isNotNull)
   .selectExpr("cast (value as string) as tdatajson", "cast (topic as string) as env")
   .select(from_json($"tdatajson", schema = ParquetSchema.tSchema).as("data"), $"env".as("env"))
   .select("data.*", "env")
   .select($"date",   // string formatted as year/month/day, e.g. 2019/01/15
           $"time",
           $"event",
           $"serviceGroupId",
           $"userId",
           $"env")

I assume you are using Spark 2.2+, where to_date accepts a format string.

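import org.apache.spark.sql.functions.{col, date_format, to_date}
// Lowercase "yyyy" is the calendar year; uppercase "YYYY" is the week-based year in Java date patterns.
// Pass "date" instead of "formatted_date" to overwrite the original column rather than add a new one.
val formattedDF = tDataJsonDF.withColumn("formatted_date", date_format(to_date(col("date"), "yyyy/MM/dd"), "yyyy-MM-dd"))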
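
The date patterns can be checked on a single value outside of Spark. The following is a minimal sketch in plain Scala using java.time, not code from the original post: DatePatternSketch, reformat and weekYearFormat are hypothetical names, and the input is assumed to be year/month/day. It uses lowercase yyyy for the calendar year, because in Java's pattern syntax uppercase YYYY denotes the week-based year, which can differ from the calendar year in the days around New Year.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.Locale

// Hypothetical helper for illustration only: it applies to one string the same
// parse-then-format steps that date_format(to_date(...)) applies per row.
object DatePatternSketch {
  // Assumed input layout: calendar year / month / day, e.g. "2019/01/15".
  private val inputFormat = DateTimeFormatter.ofPattern("yyyy/MM/dd")
  private val outputFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd")
  // Same output layout, but with the week-based year (uppercase Y).
  // Locale.US is fixed because week-based years depend on the locale's week rules.
  private val weekYearOutput = DateTimeFormatter.ofPattern("YYYY-MM-dd", Locale.US)

  // Converts "yyyy/MM/dd" text to "yyyy-MM-dd" text.
  def reformat(raw: String): String =
    LocalDate.parse(raw, inputFormat).format(outputFormat)

  // Shows what an uppercase-Y output pattern produces for the same input.
  def weekYearFormat(raw: String): String =
    LocalDate.parse(raw, inputFormat).format(weekYearOutput)
}
```

Calling DatePatternSketch.reformat("2019/12/30") keeps the year 2019, while DatePatternSketch.weekYearFormat("2019/12/30") reports 2020, because that day already belongs to the first week of 2020.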