Hadoop Hive - How to handle multiple quoteChar in serde
I have a source CSV file with data like the following:
"201814","39","0598824","YELLOW JACKET TRAP W","PIEGE GUEP.JAU,OUEST","ACT","7/20/2016","C/E","05","ST","N","15","2484","985.3999999998","43.66","3762.36","53.05"," ","N","5.83","7.9900","0000","0000","3.82","3.8181","7162","STERLING INTERNATIONAL","D","12","YJTD-DB12-W","12","32","0","0","0","0","\","3.68","0","0"
To load the data, I am using the create statement and serde below:
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = "|",
"quoteChar" = '\"',
"escapeChar" = '\\')
The problem is that any data in the file after the "\" comes through as null.
Can you tell me how to handle this?
My complete DDL is:
CREATE EXTERNAL TABLE
excess_inventory
(
whole_record string,
yyyyww string,
excess_wks_num string,
product_num string,
eng_desc string,
fr_desc string,
status string,
corp_status_change_date string,
whse_region string,
whse_id string,
channel_cd string,
eap_ind string,
fwos string,
non_alloc_qty string,
excess_qty string,
excess_cube string,
excess_inventory_dollars string,
monthly_storage_cost string,
deal_600 string,
go_ind string,
next_5_deals string,
reg_adlr string,
reg_retail string,
r52_best_promo_adlr string,
r52_best_promo_retail string,
landed_cost string,
corp_cost string,
vendor_num string,
vendor_nm string,
vendor_origin string,
vendor_moq string,
vendor_part_num string,
vendor_lead_tm string,
total_lead_tm string,
ingate_qty string,
on_order_qty string,
dealer_restriction_cd string,
quote_cost string,
casting_charge string,
action_cd string,
action_yyyyww string,
action_qty string,
sugg_adlr string,
comments string,
create_yyyyww string,
user_nm string,
batch_ts timestamp
)
PARTITIONED BY (partition_batch_ts bigint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
"separatorChar" = "|",
"quoteChar" = '\"',
"escapeChar" = '\\')
STORED AS TEXTFILE
LOCATION
'db/excess_inventory/table'
TBLPROPERTIES('skip.header.line.count'='1','serialization.null.format'='');
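Also, please let me know whether "separatorChar" = "|" means that the data will be stored in HDFS as pipe-delimited, or whether it has to specify the delimiter used in the source file.
"\","3.68","0","","","","","","","","" : none of the data after the backslash is loaded.
Your delimiter seems to be a comma (,)?
I am using "separatorChar" = "|". If I set it to "separatorChar" = ",", the data for all columns ends up in a single column.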
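One possible adjustment to the SERDEPROPERTIES is sketched below. It rests on two assumptions that the thread does not confirm: the source file is comma-delimited (as the sample row suggests), and the caret character never occurs in the data. separatorChar tells the serde how to split each line of the files under the table LOCATION, so it should match the delimiter of the source file. escapeChar is moved away from the backslash because, with the backslash as the escape character, a field that contains only a backslash is probably read as an escaped quote instead of a complete field. That would be consistent with the columns after it coming back empty.

```sql
-- Sketch only: a replacement for the ROW FORMAT clause of the DDL in the question.
-- Assumption: the source file is comma-delimited.
-- Assumption: the caret never appears in the data (hypothetical choice of escape character).
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "^"
)
```

The earlier observation that "separatorChar" = "," pushed everything into a single column was made while the backslash was still the escape character, so it may be worth re-testing after the change. Checking a row that contains the backslash field is a quick way to see whether the trailing columns are now populated.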