Hive: How to load a CSV file containing time string values into a timestamp in Hive

I have a dataset in the following format:

2019-10-01 00:00:00 UTC,cart,5773203,1487580005134238553,,runail,2.62,463240011,26dd6e6e-4dac-4778-8d2c-92e149dab885
2019-10-01 00:00:03 UTC,cart,5773353,1487580005134238553,,runail,2.62,463240011,26dd6e6e-4dac-4778-8d2c-92e149dab885
2019-10-01 00:00:07 UTC,cart,5881589,2151191071051219817,,lovely,13.48,429681830,49e8d843-adf3-428b-a2c3-fe8bc6a307c9
2019-10-01 00:00:07 UTC,cart,5723490,1487580005134238553,,runail,2.62,463240011,26dd6e6e-4dac-4778-8d2c-92e149dab885

I have created a table to load the data into:

create table if not exists product_data
(
 event_time string,
 event_type string,
 product_id string,
 category_id string,
 category_code string,
 brand string,
 price float,
 user_id bigint,
 user_session string
) row format delimited
fields terminated by ','
lines terminated by '\n'
tblproperties("skip.header.line.count"="1");

Is it possible to load the event_time field directly as a timestamp value?
I am new to Hive, and any help would be appreciated.

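Since you are loading data into Hive from a raw file in HDFS, the recommended approach is to first create an external table with every field declared as the string data type. Once the external table exists, load the data into a materialized table with a defined schema. This two-step approach helps ensure that no information is lost while loading from the file.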

Step 1: Create the external table:

create external table if not exists product_data_external_table
(
 event_time string,
 event_type string,
 product_id string,
 category_id string,
 category_code string,
 brand string,
 price string,
 user_id string,
 user_session string
) row format delimited 
fields terminated by ','
lines terminated by '\n'
location '<your hdfs file location>'
tblproperties("skip.header.line.count"="1");
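
Before moving on, it may be worth checking that the all-string external table reads the file as expected. The queries below are a minimal sketch, not part of the original answer; the column aliases bad_price and bad_user_id are made up for illustration. They rely on Hive returning NULL when a string cannot be cast to the requested numeric type.

```sql
-- Look at a few raw rows to confirm the delimiter and header handling.
select * from product_data_external_table limit 5;

-- Count non-empty values that would turn into NULL when cast to the target types.
-- bad_price and bad_user_id are illustrative alias names.
select
 sum(case when price <> '' and cast(price as float) is null then 1 else 0 end) as bad_price,
 sum(case when user_id <> '' and cast(user_id as bigint) is null then 1 else 0 end) as bad_user_id
from
 product_data_external_table;
```

If both counts are zero, the casts in the next step should not silently drop any numeric values.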
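
Step 2: Insert the records from product_data_external_table into product_data (this assumes product_data declares event_time as timestamp rather than string):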

insert into product_data 
select
 cast(from_unixtime(unix_timestamp(event_time,'yyyy-MM-dd HH:mm:ss Z'),'yyyy-MM-dd HH:mm:ss') as timestamp) as event_time,
 event_type,
 product_id,
 category_id,
 category_code,
 brand,
 cast(price as float) as price,
 cast(user_id as bigint) as user_id,
 user_session
from 
 product_data_external_table;
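
As far as I know, from_unixtime renders its result in the system time zone, so the expression above can shift the values when the cluster is not running in UTC. A possible alternative, sketched here on the assumption that every event_time value has exactly the form 'yyyy-MM-dd HH:mm:ss UTC', is to cut off the suffix and cast the first 19 characters directly. The alias names raw_value and event_ts are illustrative only.

```sql
-- Assumption: event_time always looks like '2019-10-01 00:00:00 UTC'.
-- The first 19 characters are the 'yyyy-MM-dd HH:mm:ss' part, which Hive
-- can cast to timestamp without any time zone conversion.
select
 event_time as raw_value,
 cast(substr(event_time, 1, 19) as timestamp) as event_ts
from
 product_data_external_table
limit 5;
```

Running this side by side with the unix_timestamp version on a few rows is a quick way to see whether the two approaches agree on your cluster.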

Starting with Hive 1.2.0, you can also supply the additional SerDe property timestamp.formats.
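
A rough sketch of how that property might be used follows. The table name product_data_ts is hypothetical, and the pattern "yyyy-MM-dd HH:mm:ss 'UTC'" (which treats the trailing UTC as literal text) is an assumption that should be checked against your Hive version, since values that do not match the pattern are read as NULL.

```sql
-- Sketch only: product_data_ts is a hypothetical table name.
-- event_time is declared as timestamp and parsed by the SerDe at read time.
create external table if not exists product_data_ts
(
 event_time timestamp,
 event_type string,
 product_id string,
 category_id string,
 category_code string,
 brand string,
 price float,
 user_id bigint,
 user_session string
)
row format serde 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
with serdeproperties (
 "field.delim" = ",",
 -- Assumed pattern: the quoted 'UTC' matches the literal suffix in the file.
 "timestamp.formats" = "yyyy-MM-dd HH:mm:ss 'UTC'"
)
stored as textfile
location '<your hdfs file location>'
tblproperties("skip.header.line.count"="1");

-- Any row counted here failed to parse with the pattern above.
select count(*) from product_data_ts where event_time is null;
```

With this approach the file is read in place and no intermediate all-string table is needed, at the cost of depending on the pattern matching every row.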