Importing CLOB data in Parquet format with Sqoop


I am trying to import CLOB data in Parquet format; here is my command line:

sshpass -p ${MDP_MAPR} ssh -n ${USR_MAPR}@${CNX_MAPR} sqoop import \
  -Dmapred.job.queue.name=root.leasing.dev \
  --connect ${CNX_DB} --username ${USR_DB} --password ${MDP_DB} \
  --query "${query}" \
  --delete-target-dir --target-dir ${DST_HDFS}/${SOURCE}_${table} \
  --hive-overwrite --hive-import \
  --hive-table ${SOURCE}_${table} --hive-database ${DST_HIVE} \
  --hive-drop-import-delims -m 1 ${DRIVER_DB} \
  --as-parquetfile >>${ficTrace} 2>&1
But it does not work, and I don't know why. Here is the log from the run:

Warning: /opt/mapr/sqoop/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/07/09 14:44:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-mapr-1703
18/07/09 14:44:42 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/09 14:44:42 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/07/09 14:44:42 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/hive/hive-2.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/07/09 14:44:43 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
18/07/09 14:44:43 INFO manager.SqlManager: Using default fetchSize of 1000
18/07/09 14:44:43 INFO tool.CodeGenTool: Beginning code generation
18/07/09 14:44:44 INFO manager.OracleManager: Time zone has been set to GMT
18/07/09 14:44:44 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:44 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:44 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/mapr/hadoop/hadoop-2.7.0
Note: /tmp/sqoop-mapr/compile/2b49a98afbeb2ac1135adc84c66cf092/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/07/09 14:44:48 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-mapr/compile/2b49a98afbeb2ac1135adc84c66cf092/QueryResult.jar
18/07/09 14:44:53 INFO tool.ImportTool: Destination directory /app/list/datum/data/calf_hors_prod-cluster/datum/dev/leasing/tmp_sqoop/DE_DECISIONS is not present, hence not deleting.
18/07/09 14:44:53 INFO mapreduce.ImportJobBase: Beginning query import.
18/07/09 14:44:53 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/07/09 14:44:53 INFO mapreduce.JobBase: Setting default value for hadoop.job.history.user.location=none
18/07/09 14:44:53 INFO manager.OracleManager: Time zone has been set to GMT
18/07/09 14:44:53 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:53 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:54 ERROR tool.ImportTool: Imported Failed: Cannot convert SQL type 2005

Thanks for your help.

You can try adding the following at the end of your Sqoop command:

--map-column-java <ORACLE_CLOB_COLUMN_NAME>=String
This tells Sqoop how to map the Oracle CLOB type to a Java type.

If there are multiple columns, use the following syntax pattern:

--map-column-java <ORACLE_CLOB_COLUMN_NAME_1>=String,<ORACLE_CLOB_COLUMN_NAME_2>=String
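Put together with the rest of the import, a minimal sketch of the fixed invocation might look like this. The column name DECISION_TEXT is a placeholder, not taken from your schema; substitute the actual CLOB column(s) of doe.DE_DECISIONS. I have also swapped --password for -P, as the warning in your log suggests:

```shell
# Placeholder CLOB mapping -- replace DECISION_TEXT with your real column
# name(s); use COL1=String,COL2=String for several CLOB columns.
CLOB_MAP="DECISION_TEXT=String"

# The fixed Sqoop invocation, with the CLOB-to-String mapping appended.
# ${CNX_DB} and ${query} are left as shell variables, as in the question.
sqoop_cmd="sqoop import \
  --connect \${CNX_DB} --username \${USR_DB} -P \
  --query \"\${query}\" \
  --as-parquetfile -m 1 \
  --map-column-java ${CLOB_MAP}"

echo "$sqoop_cmd"
```

Keep --password ${MDP_DB} instead of -P if the job must run non-interactively.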

hi @Jagrut Sharma, but with that, won't the CLOB be stored in binary format? I want it stored as Parquet. Thanks.

The data on HDFS will be in Parquet format. The Hive table will be created with type String for the CLOB column.

hi @Jagrut Sharma, I added:

--map-column-java <...>=String

and it did not work; this error appears in the log:

sqoop.Sqoop: Got exception running Sqoop: org.apache.avro.SchemaParseException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input within/between OBJECT entries at [Source: java.io.StringReader@28f6137b; line: 1, column: 6001]
org.apache.avro.SchemaParseException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input within/between OBJECT entries at [Source: java.io.StringReader@28f6137b; line: 1, column: 6001]
    at org.apache.avro.Schema$Parser.parse(Schema.java:955)

The error in that blog post seems different from the one you are getting. TABLE_PARAMS is a Hive metastore table that holds table parameters as key-value pairs. The Hive table created by this import will contain a property with key avro.schema.url, whose value is the location of the .avsc schema file on HDFS. That location information is unlikely to exceed the default varchar(4000) limit of the PARAM_VALUE column. Without sample data, the issue is a bit hard to reproduce. I can run this successfully against a test table with a CLOB column.