How to use hadoop streaming cmdenv with Oozie?
I have a Hadoop streaming job that takes the parameter:
-cmdenv TEXT_DIR=cachetextdir
How do I specify this in an Oozie workflow?
(I assume I can point to cachetextdir in Oozie with:
<archive>hdfs://localhost:54310/user/vm/textinput/cachetextdir.tar.gz#cachetextdir</archive>
) The streaming element in Oozie's workflow schema looks like:
<streaming>
<mapper>[MAPPER-PROCESS]</mapper>
<reducer>[REDUCER-PROCESS]</reducer>
<record-reader>[RECORD-READER-CLASS]</record-reader>
<record-reader-mapping>[NAME=VALUE]</record-reader-mapping>
...
<env>[NAME=VALUE]</env>
...
</streaming>
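Putting the pieces together, a complete streaming action in workflow.xml might look like the sketch below. The mapper, env, and archive values come from this question; the action name, the `${jobTracker}`/`${nameNode}` properties, the input/output paths, and the ok/error transitions are illustrative assumptions, not verified against a running cluster:

```xml
<action name="streaming-with-env">
    <map-reduce>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <streaming>
            <mapper>python smspipelineHadoop.py</mapper>
            <!-- Equivalent of "-cmdenv TEXT_DIR=cachetextdir" -->
            <env>TEXT_DIR=cachetextdir</env>
        </streaming>
        <configuration>
            <!-- Hypothetical input/output paths -->
            <property>
                <name>mapred.input.dir</name>
                <value>/user/vm/textinput</value>
            </property>
            <property>
                <name>mapred.output.dir</name>
                <value>/user/vm/textoutput</value>
            </property>
        </configuration>
        <!-- Ship the mapper script and unpack the cached directory -->
        <file>smspipelineHadoop.py#smspipelineHadoop.py</file>
        <archive>hdfs://localhost:54310/user/vm/textinput/cachetextdir.tar.gz#cachetextdir</archive>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
</action>
```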
Will that do the job?
Update: yes, it does:
<streaming>
<mapper>python smspipelineHadoop.py</mapper>
<env>TEXT_DIR=cachetextdir</env>
</streaming>
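On the script side, the variable set via `<env>` (or `-cmdenv`) arrives as an ordinary environment variable. A minimal sketch of how `smspipelineHadoop.py` might pick it up (the mapper body here is hypothetical; only the `TEXT_DIR` lookup reflects the question):

```python
import os
import sys

def resolve_text_dir(default="."):
    # TEXT_DIR is injected by Hadoop streaming's -cmdenv
    # (or Oozie's <env> element); fall back to a default so
    # the script can also be run locally for testing.
    return os.environ.get("TEXT_DIR", default)

def run_mapper(lines, text_dir):
    # Hypothetical mapper: prefix each non-empty input token
    # with the cached directory and emit a count of 1.
    for line in lines:
        word = line.strip()
        if word:
            print(f"{text_dir}/{word}\t1")

if __name__ == "__main__":
    run_mapper(sys.stdin, resolve_text_dir())
```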