Apache Spark: chaining DataFrame function calls

The following code does not work:

val newDF = df
          .withColumn("timestamp", when(df("processingDate").isNull, lit(new Timestamp(System.currentTimeMillis))).otherwise(df("processingDate")))
          .withColumn("year", year(df("timestamp")))
          .withColumn("month", month(df("timestamp")))
          .withColumn("day", dayofmonth(df("timestamp")))

If I run it, I get the following exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Cannot resolve column name "timestamp" among ...
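
The problem is that although I added "timestamp" as a column, it is not part of the original, immutable "df", so df("timestamp") cannot be resolved.

Is there a way to reference the previous DataFrame in the call chain?

For now I will update my code to the following so that it works, but I would like to know whether there is a better way: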

val dfWithTimestamp = df.withColumn("timestamp", when(df("monBusinessDateTimestamp").isNull, lit(new Timestamp(System.currentTimeMillis))).otherwise(df("monBusinessDateTimestamp")))

val newDF = dfWithTimestamp
          .withColumn("year", year(dfWithTimestamp("timestamp")))
          .withColumn("month", month(dfWithTimestamp("timestamp")))
          .withColumn("day", dayofmonth(dfWithTimestamp("timestamp")))

I can't check this right now, but

// Note: the $"..." column syntax needs the implicits import
// (sqlContext.implicits._, or spark.implicits._ on Spark 2.x).
val newDF = df
          .withColumn("timestamp", when(df("processingDate").isNull, lit(new Timestamp(System.currentTimeMillis))).otherwise(df("processingDate")))
          .withColumn("year", year($"timestamp"))
          .withColumn("month", month($"timestamp"))
          .withColumn("day", dayofmonth($"timestamp"))

might work.
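
For completeness, here is a minimal self-contained sketch of the same idea that uses col("timestamp") instead of $"timestamp". A column created with col(...) is not bound to any particular DataFrame; it is resolved by name against whatever DataFrame the withColumn call is applied to, which is why it can see a column added one step earlier in the chain. The assumptions are mine, not the asker's: Spark 2.x or later (for SparkSession), a local master, and made-up sample data. The object name ChainedColumns, the helper addDateParts, and the "id" column are hypothetical names used only for illustration.

```scala
import java.sql.Timestamp

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, dayofmonth, lit, month, when, year}

object ChainedColumns {

  // Hypothetical helper: adds "timestamp", "year", "month" and "day" columns.
  // col("...") is resolved against the DataFrame each withColumn is called on,
  // so later steps can refer to columns added by earlier steps.
  def addDateParts(df: DataFrame): DataFrame =
    df
      .withColumn(
        "timestamp",
        when(col("processingDate").isNull, lit(new Timestamp(System.currentTimeMillis)))
          .otherwise(col("processingDate")))
      .withColumn("year", year(col("timestamp")))
      .withColumn("month", month(col("timestamp")))
      .withColumn("day", dayofmonth(col("timestamp")))

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption: Spark 2.x+).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("chained-columns")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: one row with a processing date, one without.
    val df = Seq[(Int, Option[Timestamp])](
      (1, Some(Timestamp.valueOf("2016-03-15 10:00:00"))),
      (2, None)
    ).toDF("id", "processingDate")

    addDateParts(df).show(false)

    spark.stop()
  }
}
```

The intermediate dfWithTimestamp variable from the question is also perfectly valid; referring to columns by name just avoids having to name every intermediate DataFrame.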

Could you share the schema of your DataFrame?

Very close, it just needed a slight modification. Thanks for your help.