
Scala: adding a time interval to a column in a Spark DataFrame


Below is my dataframe:

    import spark.implicits._

    val lastRunDtDF = sc.parallelize(Seq(
      (1, 2, "2019-07-18 13:34:24")
    )).toDF("id", "cnt", "run_date")

    lastRunDtDF.show

    +---+---+-------------------+
    | id|cnt|           run_date|
    +---+---+-------------------+
    |  1|  2|2019-07-18 13:34:24|
    +---+---+-------------------+
I want to create a new dataframe that adds 2 minutes to the existing run_date column, with the new column named new_run_date. Sample output is shown below:

    +---+---+-------------------+-------------------+
    | id|cnt|           run_date|       new_run_date|
    +---+---+-------------------+-------------------+
    |  1|  2|2019-07-18 13:34:24|2019-07-18 13:36:24|
    +---+---+-------------------+-------------------+
I am trying something like the following:

  lastRunDtDF.withColumn("new_run_date",lastRunDtDF("run_date")+"INTERVAL 2 MINUTE")

It seems this is not the right approach. Thanks in advance for any help.
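For context, run_date is inferred as a plain string column in this example, which is why + with a raw string literal does not perform timestamp arithmetic. A quick schema check (a minimal sketch; the output below is what printSchema prints for exactly the DataFrame built above, nothing else is assumed):

    // Confirm the inferred column types of the example DataFrame
    lastRunDtDF.printSchema()
    // root
    //  |-- id: integer (nullable = false)
    //  |-- cnt: integer (nullable = false)
    //  |-- run_date: string (nullable = true)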


Try wrapping INTERVAL 2 MINUTE in the expr function:

import org.apache.spark.sql.functions.expr

// expr parses the SQL interval literal, so + performs timestamp arithmetic
lastRunDtDF.withColumn("new_run_date", lastRunDtDF("run_date") + expr("INTERVAL 2 MINUTE"))
           .show()
Result:

+---+---+-------------------+-------------------+
| id|cnt|           run_date|       new_run_date|
+---+---+-------------------+-------------------+
|  1|  2|2019-07-18 13:34:24|2019-07-18 13:36:24|
+---+---+-------------------+-------------------+
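As a side note, if the offset needs to be configurable, the interval literal can be built with ordinary string interpolation (a minimal sketch; the minutes variable is an illustration, not part of the original answer):

    import org.apache.spark.sql.functions.expr

    // Hypothetical parameter: any number of minutes works the same way
    val minutes = 2
    lastRunDtDF.withColumn("new_run_date",
                 lastRunDtDF("run_date") + expr(s"INTERVAL $minutes MINUTE"))
               .show()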

(or)

Using the from_unixtime and unix_timestamp functions:

import org.apache.spark.sql.functions._

// unix_timestamp converts the string to epoch seconds; add 2*60 seconds,
// then from_unixtime formats the result back to a timestamp string
lastRunDtDF.selectExpr("*",
             "from_unixtime(unix_timestamp(run_date) + 2*60, 'yyyy-MM-dd HH:mm:ss') as new_run_date")
           .show()

Result:
+---+---+-------------------+-------------------+
| id|cnt|           run_date|       new_run_date|
+---+---+-------------------+-------------------+
|  1|  2|2019-07-18 13:34:24|2019-07-18 13:36:24|
+---+---+-------------------+-------------------+
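A related cleanup worth considering (a sketch, not part of the original answer; assumes Spark 2.2+, where to_timestamp is available): cast run_date to a real timestamp column first, so new_run_date comes out as TimestampType instead of a string.

    import org.apache.spark.sql.functions.{expr, to_timestamp}

    // Cast once; interval arithmetic then yields a proper timestamp column
    val typedDF = lastRunDtDF.withColumn("run_date", to_timestamp($"run_date"))
    typedDF.withColumn("new_run_date", $"run_date" + expr("INTERVAL 2 MINUTE"))
           .show()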