Hadoop: Error executing an Oozie workflow with Sqoop

I have written a Sqoop import script to import data from Teradata into Hive. It works fine when I run it from the command line, but when I put it into a shell script and try to execute it through an Oozie workflow, I get the following error:

[0000069-150114201015959-oozie-oozi-W@sqoop-shell] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.ShellMain], exit code [1]
2015-04-02 08:50:55,440  INFO ActionEndXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@sqoop-shell] end executor for wf action 0000069-150114201015959-oozie-oozi-W with wf job 0000069-150114201015959-oozie-oozi-W
2015-04-02 08:50:55,459  INFO ActionEndXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@sqoop-shell] ERROR is considered as FAILED for SLA
2015-04-02 08:50:55,505  INFO ActionStartXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] Start action [0000069-150114201015959-oozie-oozi-W@fail] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2015-04-02 08:50:55,505  WARN ActionStartXCommand:542 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] [0000069-150114201015959-oozie-oozi-W@fail]Action status=DONE
2015-04-02 08:50:55,505  WARN ActionStartXCommand:542 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] [0000069-150114201015959-oozie-oozi-W@fail]Action updated in DB!
2015-04-02 08:50:55,522  INFO ActionEndXCommand:539 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[0000069-150114201015959-oozie-oozi-W@fail] end executor for wf action 0000069-150114201015959-oozie-oozi-W with wf job 0000069-150114201015959-oozie-oozi-W
2015-04-02 08:50:55,556  WARN CoordActionUpdateXCommand:542 - USER[qjdht93] GROUP[-] TOKEN[] APP[sqoop-shell-wf] JOB[0000069-150114201015959-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
Below is my workflow.xml:

<workflow-app xmlns='uri:oozie:workflow:0.3' name='sqoop-shell-wf'>
    <start to='sqoop-shell' />
    <action name='sqoop-shell'>
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${resourceManager}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>Sqoopscript.sh</exec>
            <file>Sqoopscript.sh#script.sh</file>
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Script failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
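
For context, a minimal sketch of what a wrapper script like Sqoopscript.sh might contain. The question does not show the real script, so everything here is illustrative: the Teradata host, credentials, paths, and table names are placeholders.

#!/bin/bash
# Hypothetical Sqoop import wrapper. set -e aborts on the first failing
# command, so the Oozie shell action sees a non-zero exit code on error.
set -e

# Placeholder connection parameters; per the question, the real script
# reads these from a parameter file (IP, table name, DB name, etc.).
JDBC_URL="jdbc:teradata://td-host.example.com/DATABASE=mydb"
TABLE="mytable"
HIVE_TABLE="mydb.mytable"

# Clean the HDFS staging directory before importing (illustrative path)
hdfs dfs -rm -r -f "/user/qjdht93/staging/${TABLE}"

sqoop import \
  --connect "$JDBC_URL" \
  --username myuser \
  --password-file "/user/qjdht93/.tdpass" \
  --table "$TABLE" \
  --target-dir "/user/qjdht93/staging/${TABLE}" \
  --hive-import \
  --hive-table "$HIVE_TABLE" \
  --num-mappers 4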

It looks like you are using a shell action where you could (and should) use the Sqoop action instead. Running a Sqoop job requires several JARs, most of which are included in the Oozie share lib. Here is an example workflow:

<workflow-app name="sqoop-to-hive" xmlns="uri:oozie:workflow:0.4">
    <start to="sqoop2hive"/>
    <action name="sqoop2hive">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <command>import --connect jdbc:mysql://mysql.example.com/sqoop --username sqoop --password sqoop --table test --hive-import --hive-table test</command>
            <archive>/tmp/mysql-connector-java-5.1.31-bin.jar#mysql-connector-java-5.1.31-bin.jar</archive>
            <file>/tmp/hive-site.xml#hive-site.xml</file>
        </sqoop>
        <ok to="end"/>
        <error to="kill"/>
    </action>
    <kill name="kill">
        <message>Action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>

As for how to use Oozie together with Sqoop, I would suggest following a step-by-step tutorial; a typical submit sequence is sketched below.
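
As a quick sketch, submitting the workflow usually looks like this. The Oozie server URL and HDFS paths are assumptions, not values from the question; the job id in the last command is reused from the log above purely as an example.

# Upload the workflow definition to HDFS (illustrative paths)
hdfs dfs -mkdir -p /user/qjdht93/sqoop-to-hive
hdfs dfs -put -f workflow.xml /user/qjdht93/sqoop-to-hive/

# Submit and start the workflow, then poll its status
oozie job -oozie http://oozie-host.example.com:11000/oozie -config job.properties -run
oozie job -oozie http://oozie-host.example.com:11000/oozie -info 0000069-150114201015959-oozie-oozi-W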

Hey, thanks!! Actually, I need to write the sqoop command inside a shell script, because a parameter file gets passed to the shell containing the information sqoop needs (IP, table name, DB name, etc.); the script also cleans up HDFS directories and performs error handling.

Hi, I changed my workflow.xml as you suggested, but I get the same error: Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception.

You may need to specify oozie.use.system.libpath=true in job.properties.

I can see the imported data in HDFS, but the Oozie Sqoop job fails to move it into the Hive table. I did specify oozie.use.system.libpath=true in the job.properties file.
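
For reference, a job.properties that enables the share lib (as suggested in the comments) might look like the following; the hostnames, ports, and application path are placeholders, not values from the question.

# Illustrative values only
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
queueName=default

# Let the launcher pick up the Sqoop/Hive JARs from the Oozie share lib
oozie.use.system.libpath=true
# HDFS directory containing workflow.xml
oozie.wf.application.path=${nameNode}/user/qjdht93/sqoop-to-hive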