Scala: How do I convert with from_unixtime and to_utc_timestamp in the same DataFrame without adding new columns?


var columnnames = "callStart_t,callend_t" // the timestamp column names are supplied dynamically

 scala> df1.show()
+------+------------+--------+----------+
|  name| callStart_t|personid| callend_t|
+------+------------+--------+----------+
| Bindu|1080602418  |       2|1080602419|
|Raphel|1647964576  |       5|1647964576|
|   Ram|1754536698  |       9|1754536699|
+------+------------+--------+----------+
The code I tried:

val newDf = df1.withColumn("callStart_Time", to_utc_timestamp(from_unixtime($"callStart_t"/1000,"yyyy-MM-dd hh:mm:ss"),"Europe/Berlin"))

 val newDf = df1.withColumn("callend_Time", to_utc_timestamp(from_unixtime($"callend_t"/1000,"yyyy-MM-dd hh:mm:ss"),"Europe/Berlin"))
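Here I do not want new columns. The converted value (from_unixtime followed by to_utc_timestamp) should be written back into the existing column that I am converting.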

Sample output

+------+---------------------+--------+--------------------+
|  name| callStart_t         |personid| callend_t          |
+------+---------------------+--------+--------------------+
| Bindu|1970-01-13 04:40:02  |       2|1970-01-13 04:40:02 |
|Raphel|1970-01-20 06:16:04  |       5|1970-01-20 06:16:04 |
|   Ram|1970-01-21 11:52:16  |       9|1970-01-21 11:52:16 |
+------+---------------------+--------+--------------------+
Note: the timestamp column names are dynamic.


How can I pick up each column dynamically?

Just pass the existing column's name to withColumn and it will replace that column:

// Chain the calls so that both columns are replaced in the same result
val newDf = df1
  .withColumn("callStart_t", to_utc_timestamp(from_unixtime($"callStart_t"/1000,"yyyy-MM-dd hh:mm:ss"),"Europe/Berlin"))
  .withColumn("callend_t", to_utc_timestamp(from_unixtime($"callend_t"/1000,"yyyy-MM-dd hh:mm:ss"),"Europe/Berlin"))
To make it dynamic, just use the relevant string as the column name. For example:

val colName = "callend_t"
val newDf = df1.withColumn(colName , to_utc_timestamp(from_unixtime(col(colName)/1000,"yyyy-MM-dd hh:mm:ss"),"Europe/Berlin"))
For multiple columns, you can do the following:

val columns=Seq("callend_t", "callStart_t")
val newDf = columns.foldLeft(df1){ case (curDf, colName) => curDf.withColumn(colName , to_utc_timestamp(from_unixtime(col(colName)/1000,"yyyy-MM-dd hh:mm:ss"),"Europe/Berlin"))}

Note: as pointed out in the comments, the division by 1000 is not needed.
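The effect of the division can be checked without Spark. The sketch below uses only `java.time` and assumes the sample values are epoch seconds. `EpochCheck` and `fromSeconds` are hypothetical names introduced for illustration. It formats in UTC with a 24-hour pattern, so the result does not depend on the machine's time zone.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object EpochCheck {
  // HH is the 24-hour field; the formatter is pinned to UTC.
  private val fmt =
    DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC)

  /** Formats a Unix epoch value given in seconds. */
  def fromSeconds(epochSeconds: Long): String =
    fmt.format(Instant.ofEpochSecond(epochSeconds))

  def main(args: Array[String]): Unit = {
    val callStart = 1080602418L // first callStart_t value from the question

    println(fromSeconds(callStart))        // 2004-03-29 23:20:18
    println(fromSeconds(callStart / 1000)) // 1970-01-13 12:10:02
  }
}
```

Read as seconds, the value falls in March 2004. After dividing by 1000 it falls on 1970-01-13, the same date that appears in the question's sample output, which suggests the division is what moves the results to January 1970.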

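Comments:

I don't want a new column.

This does not add a new column. It replaces the existing one. Just run newDf.show and look at the result.

It works, thanks.

@Assaf: There is no need to divide by 1000; the value already represents the epoch correctly.

@stack0114106: You may be right, but I simply copied the OP's basic code and showed how to use it in the suggested way. I assumed the basic conversion reflected what they needed.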