Change a string of minutes to HH:MM:SS in PySpark

I have a "minutes" column (string type) and I want to convert it to hh:mm:ss format in PySpark.

Input:
minutes(string type)
10
20
70
90
Output:
minutes(string type) min_change
10 00:10:00
20 00:20:00
70 01:10:00
90 01:30:00
Add a column lit("00:00:00") and cast it to timestamp. Convert the minutes to seconds and add them to the timestamp column. Finally, use date_format() to get the desired format:
from pyspark.sql import functions as F

df.withColumn("minutes", F.col("minutes").cast("int")) \
    .withColumn("min_change", F.lit("00:00:00").cast("timestamp")) \
    .withColumn("min_change", (F.unix_timestamp("min_change") + F.col("minutes") * 60).cast("timestamp")) \
    .withColumn("min_change", F.date_format("min_change", "HH:mm:ss")).show()
+-------+----------+
|minutes|min_change|
+-------+----------+
| 10| 00:10:00|
| 20| 00:20:00|
| 70| 01:10:00|
| 90| 01:30:00|
+-------+----------+
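The conversion itself is just integer arithmetic: whole hours are minutes divided by 60, leftover minutes are the remainder, and seconds are always zero. As a sanity check, here is a minimal plain-Python sketch of the same logic (the helper name `minutes_to_hms` is made up for illustration, not part of PySpark):

```python
def minutes_to_hms(minutes: int) -> str:
    # divmod splits total minutes into whole hours and leftover minutes
    hours, rem = divmod(minutes, 60)
    # zero-pad both fields to two digits; seconds are always 00 here
    return f"{hours:02d}:{rem:02d}:00"

for m in (10, 20, 70, 90):
    print(m, minutes_to_hms(m))
# → 10 00:10:00 / 20 00:20:00 / 70 01:10:00 / 90 01:30:00
```

This matches the expected output column above, which is a quick way to verify the Spark result.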
Please share your code. Thanks for your consideration.