Hive date string validation


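I am trying to check whether a string is a valid date in "YYYYMMDD" format.

I am using the technique below, but for invalid date strings I still get a valid date result.

What am I doing wrong?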

SELECT '20019999', CASE WHEN unix_timestamp('20019999','YYYYMMDD') > 0 THEN 'Good' ELSE 'Bad' END;

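First, you are using the wrong format. The pattern letters are case-sensitive and follow Java's SimpleDateFormat conventions: 'yyyy' is the calendar year, 'MM' the month and 'dd' the day of the month, whereas 'YYYY' is the week-based year and 'DD' the day of the year.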

select  from_unixtime(unix_timestamp())                 as default_format
       ,from_unixtime(unix_timestamp(),'YYYY-MM-DD')    as wrong_format
       ,from_unixtime(unix_timestamp(),'yyyy-MM-dd')    as right_format
;
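
The same letters matter when parsing, not only when formatting. As a rough sketch, the query below parses one string with both patterns and renders each result with a pattern known to be right. The sample value '20170215' and the column aliases are only illustrations, and what the uppercase pattern returns can differ between Hive versions, so no output is shown.

```sql
-- Parse one sample string with the correct and the incorrect pattern,
-- then print both results using the correct output pattern.
select  from_unixtime(unix_timestamp('20170215','yyyyMMdd'),'yyyy-MM-dd')   as parsed_with_right_format
       ,from_unixtime(unix_timestamp('20170215','YYYYMMDD'),'yyyy-MM-dd')   as parsed_with_wrong_format
;
```

If parsed_with_wrong_format comes back as anything other than 2017-02-15 (including NULL), the uppercase pattern is not reading the string as year, month and day.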


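Second, there is no validation of the date part ranges.
Each time you increase the day part by 1, the result simply moves forward to the next day, even past the end of the month.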

with t as (select stack(7,'27','28','29','30','31','32','33') as dy)
select  t.dy
       ,from_unixtime(unix_timestamp(concat('2017-02-',t.dy),'yyyy-MM-dd'),'yyyy-MM-dd') as dt

from    t
;

Likewise, each time you increase the month part by 1, the result moves forward to the next month, even past December.

with t as (select stack(5,'10','11','12','13','14') as mn)
select  t.mn
       ,from_unixtime(unix_timestamp(concat('2017-',t.mn,'-01'),'yyyy-MM-dd'),'yyyy-MM-dd') as dt

from    t
;

Even with CAST, validation is done only on the range of each part, not on the date as a whole.

select cast('2010-02-32' as date);




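Here is a way to achieve your goal: convert the string to a timestamp and back again, then compare the result with the original string. Only a valid date survives the round trip unchanged.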

with t as (select '20019999' as dt)
select  dt  
       ,from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd') as double_converted_dt    

       ,case 
            when from_unixtime(unix_timestamp(dt,'yyyyMMdd'),'yyyyMMdd')  = dt 
            then 'Good' 
            else 'Bad' 
        end             as dt_status

from    t
;
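
The same round-trip check can be run over several values at once. This is a minimal sketch that reuses the stack() idiom from the earlier queries; the sample strings are made up for illustration and are not from the original question.

```sql
-- Each row is one candidate date string in yyyyMMdd form.
with t as (select stack(5,'20170228','20170229','20160229','20171301','20019999') as dt)
select  t.dt
       ,case
            -- A valid date converts to a timestamp and back to the same string.
            when from_unixtime(unix_timestamp(t.dt,'yyyyMMdd'),'yyyyMMdd') = t.dt
            then 'Good'
            else 'Bad'
        end             as dt_status
from    t
;
```

A string that cannot be parsed at all should also end up as 'Bad': if the conversion yields NULL, the comparison is not true, so the CASE falls through to the ELSE branch.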


Result of the month query:

+-----+-------------+
| mn  |     dt      |
+-----+-------------+
| 10  | 2017-10-01  |
| 11  | 2017-11-01  |
| 12  | 2017-12-01  |
| 13  | 2018-01-01  |
| 14  | 2018-02-01  |
+-----+-------------+
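
Results of the CAST checks. A day outside the range 1 to 31 is rejected, but a day that is in range yet does not exist in that month is silently rolled over:

select cast('2010-02-32' as date);
+-------+
|  _c0  |
+-------+
| NULL  |
+-------+

select cast('2010-02-29' as date);
+-------------+
|     _c0     |
+-------------+
| 2010-03-01  |
+-------------+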

Result of the round-trip query for '20019999':

+-----------+----------------------+------------+
|    dt     | double_converted_dt  | dt_status  |
+-----------+----------------------+------------+
| 20019999  | 20090607             | Bad        |
+-----------+----------------------+------------+