Hadoop: importing from SQL Server, data types not converted correctly
Tags: hadoop, hive, sqoop, avro

Stack: HDP-2.3.2.0-2950 installed using Ambari 2.1

Objective:
- Import tables from SQL Server onto HDFS in Avro format
- Create external Hive Avro (SerDe) tables over the imported data
- Create external Hive ORC tables and INSERT INTO the ORC tables with SELECT * FROM the Avro tables
- Drop the Avro tables and run the tests on the ORC tables

One of the tables:
ECU_DTC_ID int
DTC_CDE nchar(20)
ECU_NAME nvarchar(15)
ECU_FAMILY_NAME nvarchar(15)
DTC_DESC nvarchar(MAX)
INSERTED_BY nvarchar(64)
INSERTION_DATE datetime
DTC_CDE_DECIMAL int
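
For this table, the Hive steps of the objective could look roughly like the sketch below. It only writes a HiveQL file for review and does not run anything. The table names dimecudtccode_avro and dimecudtccode_orc, the schema URL, and the ORC directory are assumptions for illustration; the Avro data directory assumes Sqoop wrote the table under a warehouse directory of /dataload/tohdfs/reio/odpdw/may2016. The ORC column types assume nchar/nvarchar arrive as strings and datetime arrives as a long.

```shell
#!/bin/sh
# Assumed locations: adjust to where the Avro files and the .avsc actually live.
DATA_DIR=/dataload/tohdfs/reio/odpdw/may2016/DimECUDTCCode
SCHEMA_URL=hdfs:///dataload/schemas/DimECUDTCCode.avsc   # hypothetical upload location of the .avsc
ORC_DIR=/dataload/orc/DimECUDTCCode                      # hypothetical target directory

# Write the HiveQL to a file; the unquoted heredoc expands the variables above.
cat > dimecudtccode.hql <<EOF
-- External Avro table; its columns come from the .avsc referenced below.
CREATE EXTERNAL TABLE dimecudtccode_avro
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '${DATA_DIR}'
TBLPROPERTIES ('avro.schema.url'='${SCHEMA_URL}');

-- External ORC table; datetime is kept as BIGINT because it was imported as a long.
CREATE EXTERNAL TABLE dimecudtccode_orc (
  ecu_dtc_id INT,
  dtc_cde STRING,
  ecu_name STRING,
  ecu_family_name STRING,
  dtc_desc STRING,
  inserted_by STRING,
  insertion_date BIGINT,
  dtc_cde_decimal INT
)
STORED AS ORC
LOCATION '${ORC_DIR}';

-- Copy everything from the Avro table into the ORC table.
INSERT INTO TABLE dimecudtccode_orc SELECT * FROM dimecudtccode_avro;
EOF

echo "Wrote dimecudtccode.hql; review it, then run: hive -f dimecudtccode.hql"
```

If the Avro table comes back empty, the LOCATION is the first thing to compare against the directory that actually contains the .avro files.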
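When I run a plain sqoop import, datetime is converted to long, and nchar and nvarchar are converted to String. The generated avsc file is shown below. When I create the Hive Avro table, it does not pick up the generated Avro files, so the table is left empty: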
{
"type" : "record",
"name" : "DimECUDTCCode",
"doc" : "Sqoop import of DimECUDTCCode",
"fields" : [ {
"name" : "ECU_DTC_ID",
"type" : [ "null", "int" ],
"default" : null,
"columnName" : "ECU_DTC_ID",
"sqlType" : "4"
}, {
"name" : "DTC_CDE",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "DTC_CDE",
"sqlType" : "-15"
}, {
"name" : "ECU_NAME",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "ECU_NAME",
"sqlType" : "-9"
}, {
"name" : "ECU_FAMILY_NAME",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "ECU_FAMILY_NAME",
"sqlType" : "-9"
}, {
"name" : "DTC_DESC",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "DTC_DESC",
"sqlType" : "-9"
}, {
"name" : "INSERTED_BY",
"type" : [ "null", "string" ],
"default" : null,
"columnName" : "INSERTED_BY",
"sqlType" : "-9"
}, {
"name" : "INSERTION_DATE",
"type" : [ "null", "long" ],
"default" : null,
"columnName" : "INSERTION_DATE",
"sqlType" : "93"
}, {
"name" : "DTC_CDE_DECIMAL",
"type" : [ "null", "int" ],
"default" : null,
"columnName" : "DTC_CDE_DECIMAL",
"sqlType" : "4"
} ],
"tableName" : "DimECUDTCCode"
}
I decided to include --map-column-java:
sqoop import --connect 'jdbc:sqlserver://somedbserver;database=somedb' --username someusername --password somepassword --as-avrodatafile --num-mappers 8 --table DimECUDTCCode --map-column-java DTC_CDE=string,ECU_NAME=string,ECU_FAMILY_NAME=string,DTC_DESC=string,INSERTED_BY=string,INSERTION_DATE=timestamp --warehouse-dir /dataload/tohdfs/reio/odpdw/may2016 --verbose
But I got the following error:
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_NAME to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column ECU_FAMILY_NAME to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_DESC to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTED_BY to string
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type string
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column INSERTION_DATE to timestamp
16/05/12 09:43:12 ERROR orm.ClassWriter: No ResultSet method for Java type timestamp
16/05/12 09:43:12 INFO orm.ClassWriter: Overriding type of column DTC_CDE to string
16/05/12 09:43:12 ERROR tool.ImportTool: Imported Failed: No ResultSet method for Java type string
[sqoop@l1038lab root]$
What am I missing?

It turns out that string, STRING and String are treated differently by Sqoop. The correct spelling is String.
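
As a minimal sketch of that fix, the snippet below rebuilds the --map-column-java value from the question with the exact spelling String and prints the resulting command for review instead of running it. INSERTION_DATE is left out of the overrides here, which is my assumption rather than part of the answer: the log shows that the value timestamp was rejected in the same way as string, and without an override the column keeps the default long described in the question.

```shell
#!/bin/sh
# Columns from the question that were overridden with the lowercase "string".
STRING_COLS="DTC_CDE ECU_NAME ECU_FAMILY_NAME DTC_DESC INSERTED_BY"

# Build a comma-separated list of COLUMN=String pairs (case-sensitive Java type name).
MAPPING=""
for col in $STRING_COLS; do
  MAPPING="${MAPPING:+$MAPPING,}${col}=String"
done

# Same options as the command in the question, with only the mapping changed.
CMD="sqoop import --connect 'jdbc:sqlserver://somedbserver;database=somedb' --username someusername --password somepassword --as-avrodatafile --num-mappers 8 --table DimECUDTCCode --map-column-java $MAPPING --warehouse-dir /dataload/tohdfs/reio/odpdw/may2016 --verbose"

# Print the command so it can be checked before it is run by hand.
echo "$CMD"
```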
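You can try --map-column-hive and map the SQL Server columns directly to Hive columns.
But why Hive? I want to use Java, but had no success with it.
Yes, you should keep looking for the problem with --map-column-java. I just wanted to give you an alternative in case you get stuck, because I tried --map-column-hive and it worked.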