Creating a column with the Date data type in a Hive table
I have created a table in Hive (0.10.0) with the following values:
2012-01-11 17:51 Stockton Children's Clothing 168.68 Cash
2012-01-11 17:51 Tampa Health and Beauty 441.08 Amex
............
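Here the date and time are tab-separated values. I need to work with the date column, and since Hive does not allow the "date" data type, I used "timestamp" for the first date column (2012-01-11, …).
However, after creating the table, the first column shows NULL values.
How can I solve this? Please guide.
I loaded the data into a table with all columns defined as string, then cast the date value and loaded it into another table whose column is defined as date. It seems to work without any problem. The only difference is that I am using the Shark build of Hive, and to be honest, I am not sure whether there are any significant differences between actual Hive and Shark Hive.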
Data:
hduser2@ws-25:~$ more test.txt
2010-01-05 17:51 Visakh
2013-02-16 09:31 Nair
Code:
[localhost:12345] shark> create table test_time(dt string, tm string, nm string) row format delimited fields terminated by '\t' stored as textfile;
Time taken (including network latency): 0.089 seconds
[localhost:12345] shark> describe test_time;
dt string
tm string
nm string
Time taken (including network latency): 0.06 seconds
[localhost:12345] shark> load data local inpath '/home/hduser2/test.txt' overwrite into table test_time;
Time taken (including network latency): 0.124 seconds
[localhost:12345] shark> select * from test_time;
2010-01-05 17:51 Visakh
2013-02-16 09:31 Nair
Time taken (including network latency): 0.397 seconds
[localhost:12345] shark> select cast(dt as date) from test_time;
2010-01-05
2013-02-16
Time taken (including network latency): 0.399 seconds
[localhost:12345] shark> create table test_date as select cast(dt as date) from test_time;
Time taken (including network latency): 0.71 seconds
[localhost:12345] shark> select * from test_date;
2010-01-05
2013-02-16
Time taken (including network latency): 0.366 seconds
[localhost:12345] shark>
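If the DATE type is not available (the question mentions Hive 0.10.0), one option might be to leave dt as a string. Values in yyyy-MM-dd form compare correctly as text, and Hive's built-in date functions such as year(), month() and datediff() accept such strings. A minimal sketch against the test_time table above; the cutoff and reference dates are arbitrary values chosen for illustration:

```sql
-- dt stays a STRING; no DATE column is needed for these operations.
SELECT dt,
       year(dt)                    AS yr,          -- year part of the string
       month(dt)                   AS mon,         -- month part of the string
       datediff('2013-02-16', dt)  AS days_before  -- days from dt to the reference date
FROM test_time
WHERE dt >= '2010-01-01';                          -- text comparison works for yyyy-MM-dd
```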
If you are using timestamp, you can try concatenating the date and time strings and then casting the result:
create table test_1 as select cast(concat(dt,' ', tm,':00') as string) as ts from test_time;
select cast(ts as timestamp) from test_1;
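Applied to the file layout in the question, the same idea might look like the sketch below. The table names (sales_raw, sales), the column names, and the input path are hypothetical, and it assumes the time field is always in HH:mm form so that appending ':00' yields the yyyy-MM-dd HH:mm:ss text that Hive can cast to a timestamp.

```sql
-- Staging table: the date and time fields are loaded as plain strings,
-- so a date-only value is not turned into NULL at read time.
CREATE TABLE sales_raw (
  dt       STRING,   -- e.g. 2012-01-11
  tm       STRING,   -- e.g. 17:51
  store    STRING,
  category STRING,
  amount   DOUBLE,
  payment  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/path/to/purchases.txt' OVERWRITE INTO TABLE sales_raw;

-- Combine the two fields into 'yyyy-MM-dd HH:mm:ss' and cast to TIMESTAMP.
CREATE TABLE sales AS
SELECT CAST(CONCAT(dt, ' ', tm, ':00') AS TIMESTAMP) AS ts,
       store,
       category,
       amount,
       payment
FROM sales_raw;
```

Keeping the staging table around makes it easy to compare the raw strings with the converted values if some rows still come out as NULL.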
Using the load command from Beeline worked fine for me.
Create table statement:
create table mytime(id string ,t timestamp) row format delimited fields terminated by ',';
And the load data statement:
load data local inpath '/root/workspace/timedata' overwrite into table mytime;
Table structure:
describe mytime;
+-----------+------------+----------+--+
| col_name | data_type | comment |
+-----------+------------+----------+--+
| id | string | |
| t | timestamp | |
+-----------+------------+----------+--+
Query result:
select * from mytime;
+------------+------------------------+--+
| mytime.id | mytime.t |
+------------+------------------------+--+
| buy | 1977-03-12 06:30:23.0 |
| sell | 1989-05-23 07:23:12.0 |
+------------+------------------------+--+
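A direct load like this should only work when the text in the file is already in the form Hive expects for timestamps, yyyy-MM-dd HH:mm:ss with an optional fractional part. A quick sanity check after loading, sketched against the mytime table above, is to count the rows where the value could not be parsed:

```sql
-- Rows whose second field was not valid timestamp text come back as NULL.
SELECT COUNT(*) AS unparsed_rows
FROM mytime
WHERE t IS NULL;
```

A non-zero count suggests the source file needs the staging-table approach instead.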
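Apache Hive data types matter for both the query language and data modeling (the representation of data structures in the tables of a company's database).
It is necessary to understand the data types and how they are used when defining table column types.
Apache Hive has two main kinds of data types:
Primitive data types
Complex data types
Only the complex data types are discussed here.
Complex data types are further divided into four types, explained below.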
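2.1 ARRAY
It is an ordered collection of fields.
All fields must be of the same type.
Syntax: ARRAY<data_type>
Example: array(1,4)
2.2 MAP
It is an unordered collection of key-value pairs.
Keys must be primitives; values can be of any type.
Syntax: MAP<primitive_type, data_type>
Example: map('a',1,'c',3)
2.3 STRUCT
It is a collection of elements of different types.
Syntax: STRUCT<col_name : data_type, ...>
Example: struct('a',1.0)
2.4 UNION
It is a collection of heterogeneous data types.
Syntax: UNIONTYPE<data_type, data_type, ...>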
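Example: create_union(1,'a',63)
Did you try loading the column as a string and then casting it to date? Something like
cast(col as date)
Note that this only works for the YYYY-MM-DD format.
Yes, it is in yyyy-MM-dd format. I tried casting to date, but it gave different data: 1969-12-31 16:00:00 09:00 San Jose Men's Clothing 214.05 Amex 1969-12-31 16:00:00 09:00 Fort Worth Women's Clothing 153.57 V 1969-12-31 …
It seems the date is being reset to the start of Unix epoch time … I believe you are defining the data as tab-delimited, so the date part is loaded into one column and the time into another …
Yes, the date and time are tab-separated. How can I solve this? I used timestamp because Hive 0.10.0 does not support date.
@ashwini I have updated my answer to load the data as timestamp.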