Hadoop: error executing an Oozie workflow that runs Sqoop
I have written a Sqoop import script that loads data from Teradata into Hive. It works fine when I run it from the command line, but when I put it into a shell script and try to execute it through an Oozie workflow, I get the following error:
[0000069-150114201015959-oozie-oozi-W@sqoop-shell] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
2015-04-02 08:50:55,440 INFO ActionEndXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@sqoop-shell] end executor for wf action 0000069-150114201015959-oozie-oozi-W with wf job 0000069-150114201015959-oozie-oozi-W
2015-04-02 08:50:55,459 INFO ActionEndXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@sqoop-shell] ERROR is considered as FAILED for SLA
2015-04-02 08:50:55,505 INFO ActionStartXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] Start action [0000069-150114201015959-oozie-oozi-W@fail] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2015-04-02 08:50:55,505 WARN ActionStartXCommand:542 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] [0000069-150114201015959-oozie-oozi-W@fail]Action status=DONE
2015-04-02 08:50:55,505 WARN ActionStartXCommand:542 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] [0000069-150114201015959-oozie-oozi-W@fail]Action updated in DB!
2015-04-02 08:50:55,522 INFO ActionEndXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] end executor for wf action 0000069-150114201015959-oozie-oozi-W with wf job 0000069-150114201015959-oozie-oozi-W
2015-04-02 08:50:55,556 WARN CoordActionUpdateXCommand:542 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
Below is my workflow.xml:
<workflow-app xmlns='uri:oozie:workflow:0.3' name='sqoop-shell-wf'>
    <start to='sqoop-shell' />
    <action name='sqoop-shell'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>Sqoopscript.sh</exec>
            <file>Sqoopscript.sh#script.sh</file>
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
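The shell action above expects Sqoopscript.sh to be shipped alongside the workflow. A minimal sketch of what such a script might look like, based on the question's description (parameter file with connection details, HDFS cleanup, error handling) — the variable names, parameter-file convention, and Teradata connection string are assumptions, not the asker's actual script:

```shell
#!/usr/bin/env bash
# Hypothetical Sqoopscript.sh: sources a parameter file that defines
# DB_IP, DB_NAME, TABLE_NAME and TARGET_DIR, cleans the HDFS target
# directory, then runs the Sqoop import into Hive.
run_import() {
  local param_file="$1"
  # Load connection details (IP, DB name, table, target dir) from the file
  . "$param_file"
  # Remove stale output so the import does not fail on an existing directory
  hadoop fs -rm -r -f "$TARGET_DIR"
  sqoop import \
    --connect "jdbc:teradata://${DB_IP}/DATABASE=${DB_NAME}" \
    --table "$TABLE_NAME" \
    --hive-import \
    --hive-table "$TABLE_NAME" \
    --target-dir "$TARGET_DIR"
}
# Invoked by the Oozie shell action as: Sqoopscript.sh <param-file>
```

Note also that in the workflow above, `<file>Sqoopscript.sh#script.sh</file>` localizes the script under the symlink name script.sh, while `<exec>` references Sqoopscript.sh; those two names need to match, and a mismatch alone can produce the ShellMain exit code [1].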
It looks like you are using a shell action where you could/should use a Sqoop action instead. Running a Sqoop job requires several JARs, most of which are included in the Oozie sharelib. Here is an example workflow:
<workflow-app name="sqoop-to-hive" xmlns="uri:oozie:workflow:0.4">
    <start to="sqoop2hive"/>
    <action name="sqoop2hive">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:mysql://mysql.example.com/sqoop --username sqoop --password sqoop --table test --hive-import --hive-table test</command>
            <archive>/tmp/mysql-connector-java-5.1.31-bin.jar#mysql-connector-java-5.1.31-bin.jar</archive>
            <file>/tmp/hive-site.xml#hive-site.xml</file>
        </sqoop>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
As for how to use Oozie with Sqoop, I would suggest following the steps above.

Comments:

- Hey, thanks! Actually, I need to keep the Sqoop command inside a shell script, because a parameter file is passed to the shell containing the details Sqoop needs (IP, table name, DB name, etc.); the script also cleans up the HDFS directory and does error handling.
- Hi, I changed my workflow.xml as you suggested, but got the same error: Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception.
- You may need to specify oozie.use.system.libpath=true in job.properties.
- I can see the imported data in HDFS, but the Oozie Sqoop job fails to move it into the Hive table. I did specify oozie.use.system.libpath=true in my job.properties file.
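For reference, a minimal job.properties for the Sqoop-action workflow might look like the sketch below. The host names and paths are placeholders, not values from the question; the line relevant to the SqoopMain error is oozie.use.system.libpath=true, which makes the sharelib's Sqoop and Hive JARs available to the launcher job.

```properties
nameNode=hdfs://namenode.example.com:8020
jobTracker=jobtracker.example.com:8032
queueName=default
# Pull Sqoop/Hive JARs from the Oozie sharelib on HDFS
oozie.use.system.libpath=true
# HDFS directory containing workflow.xml
oozie.wf.application.path=${nameNode}/user/qjdht93/sqoop-to-hive
```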