Scala Spark DataFrame: convert an integer to a timestamp and find the date difference

I have this DataFrame (org.apache.spark.sql.DataFrame):

|-- timestamp: integer (nullable = true)
|-- checkIn: string (nullable = true)

| timestamp|   checkIn|
+----------+----------+
|1521710892|2018-05-19|
|1521710892|2018-05-19|
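
Desired result: a new column holding the difference in days between the checkIn date and the date of the timestamp (2018-03-03 23:59:59 and 2018-03-04 00:00:01 should differ by 1).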

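So I need to:

  • convert the timestamp to a date (this is where I'm stuck)
  • subtract one date from the other
  • use some function to extract the number of days (I haven't found this function yet)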

You can use from_unixtime to convert the timestamp to a date, and datediff to compute the difference in days:

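import org.apache.spark.sql.functions.{datediff, from_unixtime}
import spark.implicits._  // for toDF and the $"col" syntax (already in scope in spark-shell)

val df = Seq(
  (1521710892, "2018-05-19"),
  (1521730800, "2018-01-01")
).toDF("timestamp", "checkIn")

// from_unixtime renders epoch seconds as a "yyyy-MM-dd HH:mm:ss" string in the session time zone;
// datediff(end, start) returns end minus start in whole calendar days.
df.withColumn("tsDate", from_unixtime($"timestamp")).
  withColumn("daysDiff", datediff($"tsDate", $"checkIn")).
  show()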

// +----------+----------+-------------------+--------+
// | timestamp|   checkIn|             tsDate|daysDiff|
// +----------+----------+-------------------+--------+
// |1521710892|2018-05-19|2018-03-22 02:28:12|     -58|
// |1521730800|2018-01-01|2018-03-22 08:00:00|      80|
// +----------+----------+-------------------+--------+
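
The question's edge case (2018-03-03 23:59:59 versus 2018-03-04 00:00:01 should differ by 1) comes down to comparing calendar dates rather than elapsed time, which is how datediff behaves because it works on the date part of its arguments. The following is a minimal sketch in plain Scala with java.time, with no Spark involved; the names a, b, elapsedDays and calendarDays are illustrative only.

```scala
import java.time.LocalDateTime
import java.time.temporal.ChronoUnit

// The two instants from the question, two seconds apart across midnight.
val a = LocalDateTime.parse("2018-03-03T23:59:59")
val b = LocalDateTime.parse("2018-03-04T00:00:01")

// Elapsed-time difference: counts only complete 24-hour days between the two.
val elapsedDays = ChronoUnit.DAYS.between(a, b)

// Calendar-date difference: drop the time of day first, then subtract the dates.
val calendarDays = ChronoUnit.DAYS.between(a.toLocalDate, b.toLocalDate)

println(s"elapsed=$elapsedDays calendar=$calendarDays")
```

The calendar-date variant is the one that matches the requirement stated in the question.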

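from_unixtime returns a string, so on larger data you are almost certainly better off casting the numeric column to a timestamp instead, for example:

// needs org.apache.spark.sql.functions.col and org.apache.spark.sql.types.TimestampType
df.withColumn("tsDate", col("timestamp").cast(TimestampType))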
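
For completeness, here is a hedged sketch of how the cast-based approach could be carried through to the day difference. It assumes a local SparkSession created just for illustration (in spark-shell, spark already exists) and pins the session time zone to UTC so the derived dates do not depend on the machine; neither choice comes from the original answers. to_date is used to reduce the timestamp to a date before calling datediff.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, datediff, to_date}
import org.apache.spark.sql.types.TimestampType

// Assumption: a throwaway local session, used only to make the sketch self-contained.
val spark = SparkSession.builder().master("local[*]").appName("ts-date-diff").getOrCreate()
import spark.implicits._

// Assumption: fix the session time zone so the epoch-to-date conversion is reproducible.
spark.conf.set("spark.sql.session.timeZone", "UTC")

val df = Seq(
  (1521710892, "2018-05-19"),
  (1521730800, "2018-01-01")
).toDF("timestamp", "checkIn")

val result = df
  // Casting an integer column to TimestampType treats the value as seconds since the epoch.
  .withColumn("tsDate", to_date(col("timestamp").cast(TimestampType)))
  // datediff(end, start): days from checkIn to tsDate, negative when checkIn is later.
  .withColumn("daysDiff", datediff(col("tsDate"), to_date(col("checkIn"))))

result.show()
```

Swapping the two arguments of datediff flips the sign, which may be closer to what the question asks for (checkIn minus the timestamp date).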