Date difference in Hive, with the result formatted as hh:mm:ss


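I am trying to find the difference between two dates in consecutive rows. I am using a Hive window function, namely lag, but the output should be formatted as hh:mm:ss.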

For example:

  • Date 1 is 2017-08-15 02:00:32
  • Date 2 is 2017-08-15 02:00:20

The output should be:

00:00:12

The query I tried:

select from_unixtime(column_name),
(lag(unix_timestamp(from_unixtime(column_name)),1,0)
over(partition by column_name)-
unix_timestamp(from_unixtime(column_name))) as Duration from table_name;
But this returns the output as 12 (for the example above).

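Update: The column is stored in the table as bigint. The time is in epoch format, and we use from_unixtime in the query to convert it to a readable date. Sample values from the timestamp column: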

1502802618
1502786788

This answer applies only as long as the time difference is less than 24 hours.

hive> with t as (select timestamp '2017-08-15 02:00:32' as ts1,timestamp '2017-08-15 02:00:20' as ts2)
    > select  ts1 - ts2   as diff
    > from    t
    > ;
OK
diff
0 00:00:12.000000000
Given timestamps

hive> with t as (select timestamp '2017-08-15 02:00:32' as ts1,timestamp '2017-08-15 02:00:20' as ts2)
    > select  split(ts1 - ts2,'[ .]')[1]  as diff
    > from    t
    > ;
OK
diff
00:00:12
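
To see why the 24-hour caveat holds, it may help to look at what the split produces. The sketch below applies the same regular expression to the literal interval string shown in the first result; it assumes a Hive version that allows select without a from clause.

```sql
-- '[ .]' splits on a space or a dot, so the interval text breaks into
-- the day count, the hh:mm:ss part, and the fractional seconds.
select split('0 00:00:12.000000000', '[ .]') as parts;
-- parts: ["0","00:00:12","000000000"]
```

Index 1 is the hh:mm:ss part. The day count at index 0 is discarded, which is why a difference of 24 hours or more would lose its day component.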
Given strings

hive> with t as (select '2017-08-15 02:00:32' as ts1,'2017-08-15 02:00:20' as ts2)
    > select  split(cast(ts1 as timestamp) - cast(ts2 as timestamp),'[ .]')[1]  as diff
    > from    t
    > ;
OK
diff
00:00:12
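
Applied to the question's table, the same pattern might look like the sketch below. It is untested and assumes a Hive version that supports timestamp subtraction (as the outputs above imply); table_name and column_name are the placeholder names from the question, and ordering the window by the epoch column is an assumption about what "consecutive rows" means.

```sql
-- Sketch only: table_name and column_name are the placeholders from the question.
select  from_unixtime(column_name)      as event_time,
        split(ts - prev_ts, '[ .]')[1]  as duration   -- keep only the hh:mm:ss part
from   (select column_name,
               cast(from_unixtime(column_name) as timestamp) as ts,
               -- previous row's timestamp; NULL for the first row
               lag(cast(from_unixtime(column_name) as timestamp), 1)
                   over (order by column_name) as prev_ts
        from   table_name) t;
```

The first row has no predecessor, so its duration comes out as NULL rather than a misleading value.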

This answer also applies only as long as the time difference is less than 24 hours.

hive> with t as (select 1502802618 as ts1,1502786788 as ts2)
    > select  from_unixtime(to_unix_timestamp('0001-01-01 00:00:00')+(ts1 - ts2))  as diff
    > from    t
    > ;
OK
diff
0001-01-01 04:23:50
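
The result above still carries the 0001-01-01 date prefix. A possible way to get only hh:mm:ss, and to combine it with lag on the bigint column, is sketched below. It relies on the optional format argument of from_unixtime; table_name and column_name are the placeholder names from the question, and ordering the window by the epoch column is an assumption. It avoids timestamp subtraction, so it may suit older Hive versions, but it has not been tested here.

```sql
-- Sketch only: table_name and column_name are the placeholders from the question.
select  from_unixtime(column_name) as event_time,
        -- add the difference in seconds to a midnight anchor, then print only the time part
        from_unixtime(to_unix_timestamp('0001-01-01 00:00:00') + diff, 'HH:mm:ss') as duration
from   (select column_name,
               -- seconds since the previous row; NULL for the first row
               column_name - lag(column_name, 1) over (order by column_name) as diff
        from   table_name) t;
```

As with the original answer, a difference of 24 hours or more wraps around, because only the time-of-day part is printed.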


I tried to execute your query, but it throws the error "No matching method for class org.apache.hadoop.hive.ql.udf.generic.GenericUDFOPMinus with (timestamp, timestamp)". To give more information about the table schema: the column is stored as bigint and holds an epoch timestamp.

This is tested code. Apparently your Hive version does not support timestamp subtraction.

We are using Cloudera-5.3.1/Hive-0.13.1-cdh5.3.1. Does this version not support it?

@Shash, a version from 3 years ago (June 6, 2014)? ... No.

@Shash, (1) comment in the right place; (2) this is basic SQL: just replace ts1 and ts2 with the relevant expressions (the column and lag(...) over(...)).