Creating a column with a date data type in a Hive table

I have created a table in Hive (0.10.0) with the following values:

2012-01-11  17:51   Stockton    Children's Clothing     168.68  Cash
2012-01-11  17:51   Tampa       Health and Beauty       441.08  Amex
............
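Here the date and time are tab-separated values, and I need to work on the date column. Because Hive does not allow a "date" data type, I used "timestamp" for the first date column (2012-01-11, …). However, after creating the table, the first column shows NULL values.

How can I solve this problem? Please advise.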

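I loaded the data into a table with all columns defined as string, then cast the date value and loaded it into another table whose column is defined as date. It worked without any problems. The only difference is that I am using the Shark version of Hive and, to be honest, I am not sure whether there is any significant difference between actual Hive and Shark Hive.

Data and code: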

hduser2@ws-25:~$ more test.txt 
2010-01-05  17:51   Visakh
2013-02-16  09:31   Nair
[localhost:12345] shark>  create table test_time(dt string, tm string, nm string) row format delimited fields terminated by '\t' stored as textfile;
Time taken (including network latency): 0.089 seconds
[localhost:12345] shark> describe test_time;
dt  string  
tm  string  
nm  string  
Time taken (including network latency): 0.06 seconds
[localhost:12345] shark> load data local inpath '/home/hduser2/test.txt' overwrite into table test_time;                                                   
Time taken (including network latency): 0.124 seconds
[localhost:12345] shark> select * from test_time;
2010-01-05  17:51   Visakh
2013-02-16  09:31   Nair
Time taken (including network latency): 0.397 seconds
[localhost:12345] shark> select cast(dt as date) from test_time;
2010-01-05
2013-02-16
Time taken (including network latency): 0.399 seconds
[localhost:12345] shark> create table test_date as select cast(dt as date) from test_time;
Time taken (including network latency): 0.71 seconds
[localhost:12345] shark> select * from test_date;
2010-01-05
2013-02-16
Time taken (including network latency): 0.366 seconds
[localhost:12345] shark> 
If you are using timestamp, you can try concatenating the date and time strings and then casting the result:

create table test_1 as select cast(concat(dt,' ', tm,':00') as string) as ts from test_time;

select cast(ts as timestamp) from test_1;
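
Applied to the six-column data in the question, the same idea might look like the sketch below. The table names (`sales_raw`, `sales`), the column names, and the input path are hypothetical; only the tab delimiter and the `yyyy-MM-dd` and `HH:mm` layouts come from the question. Hive expects timestamp text in `yyyy-MM-dd HH:mm:ss[.f...]` form, so a field holding only `2012-01-11` does not parse as a timestamp. That is consistent with the NULL values reported, and it is why `:00` is appended here.

```sql
-- Staging table: every field is read as plain text (all names are hypothetical).
create table sales_raw (
  dt string, tm string, store string, category string, amount string, payment string
)
row format delimited fields terminated by '\t'
stored as textfile;

-- Hypothetical path; point it at the actual tab-separated file.
load data local inpath '/path/to/sales.txt' overwrite into table sales_raw;

-- Build one timestamp from the two text columns, e.g. '2012-01-11' + ' ' + '17:51' + ':00'.
create table sales as
select
  cast(concat(dt, ' ', tm, ':00') as timestamp) as sold_at,
  store,
  category,
  cast(amount as double) as amount,
  payment
from sales_raw;
```

Keeping the staging table as strings means a malformed row shows up as a NULL in `sales` while the original text is still available in `sales_raw` for inspection.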

Using the load command from Beeline worked fine for me.

Data:

Create table statement:

create table mytime(id string ,t timestamp) row format delimited fields terminated by ',';
And the load data statement:

load data local inpath '/root/workspace/timedata' overwrite into table mytime;
Table structure:

describe mytime;      
+-----------+------------+----------+--+
| col_name  | data_type  | comment  |
+-----------+------------+----------+--+
| id        | string     |          |
| t         | timestamp  |          |
+-----------+------------+----------+--+
Query result:

select * from mytime;                                                                     
+------------+------------------------+--+
| mytime.id  |        mytime.t        |
+------------+------------------------+--+
| buy        | 1977-03-12 06:30:23.0  |
| sell       | 1989-05-23 07:23:12.0  |
+------------+------------------------+--+
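Data types in Apache Hive matter for both the query language and data modeling (how the structure of data is represented in the tables of a company database). It is necessary to understand the data types and how they are used when defining the column types of a table. Hive has two main kinds of data types: primitive data types and complex data types. Only the complex data types are discussed here. They are further divided into four types, explained below.

2.1 Arrays: An ordered collection of fields. All fields must be of the same type. Syntax: ARRAY<data_type>

Example: array(1,4)

2.2 Maps: An unordered collection of key-value pairs. Keys must be primitives; values can be of any type. Syntax: MAP<primitive_type, data_type>

Example: map('a',1,'c',3)

2.3 Structs: A collection of elements of different types. Syntax: STRUCT<col_name : data_type, ...>

Example: struct('a',1.0)

2.4 Unions: A value that holds exactly one of several declared data types. Syntax: UNIONTYPE<data_type, data_type, ...>

Example: create_union(1,'a',63)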
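
To show how the four complex types fit together in a table definition, here is a minimal sketch. The table name `complex_demo`, its column names, and the delimiters are assumptions chosen for illustration; the type syntax and the constructor functions are the ones listed above.

```sql
-- Hypothetical table using each complex type once.
create table complex_demo (
  id    int,
  tags  array<string>,
  attrs map<string, int>,
  pt    struct<label:string, score:double>,
  misc  uniontype<int, string>
)
row format delimited
  fields terminated by '\t'
  collection items terminated by ','
  map keys terminated by ':'
stored as textfile;

-- Element access: index into an array, look up a map key, read a struct field.
select tags[0], attrs['a'], pt.label from complex_demo;

-- The constructor functions from the examples above, evaluated per row.
-- In create_union the first argument is the tag that selects which value is held.
select array(1,4), map('a',1,'c',3), struct('a',1.0), create_union(1,'a',63)
from complex_demo;
```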

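Did you try loading the column as a string and then casting it to date? Something like cast(col as date). Note that this only works for the YYYY-MM-DD format.

Yes, it is in yyyy-MM-dd format. I tried with date, but it gives different data: 1969-12-31 16:00:00 09:00 San Jose Men's Clothing 214.05 Amex 1969-12-31 16:00:00 09:00 Fort Worth Women's Clothing 153.57 V 1969-12-31 …

It seems the date is being reset to the start of Unix epoch time. I believe you are defining the data as tab-delimited, which loads the date part into one column and the time into another.

Yes, the date and time are both tab-separated. How can I solve this? I used timestamp because Hive 0.10.0 does not support date.

@ashwini I have updated my answer to load the data as timestamp.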
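
For input that is not already in the layout a plain cast expects, one option is to parse the text with an explicit pattern. The sketch below reuses the `test_time` table from the answer above and relies on the built-in `unix_timestamp(string, pattern)` and `from_unixtime(bigint)` functions. The pattern shown matches the sample data and is an assumption to adjust for other layouts. Both functions interpret values in the session time zone.

```sql
-- Parse the 'yyyy-MM-dd' and 'HH:mm' text with an explicit pattern,
-- then convert the resulting epoch seconds back into a timestamp.
select
  cast(from_unixtime(unix_timestamp(concat(dt, ' ', tm), 'yyyy-MM-dd HH:mm')) as timestamp) as ts,
  nm
from test_time;
```

A value near 1969-12-31 or 1970-01-01 in the output usually means the text did not match the pattern, so it is worth checking a few rows before loading the full data set.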