Hadoop Python Oozie shell action cannot load a file

This follows on from my earlier question.

I have an Oozie workflow containing a shell action that calls a Python script, and the script fails with the following error:

IOError: [Errno 13] Permission denied: '/home/test/myfile.txt'
All the Python script (hello.py) tries to do is open a file. This code works fine when executed outside Hadoop:

if __name__ == '__main__':
    print('Starting script')

    # Path on the local filesystem; works when run directly on the host,
    # but fails with Errno 13 inside the Oozie shell action
    filein = '/home/test/myfile.txt'

    file = open(filein, 'r')
Here is my Oozie workflow:

<workflow-app xmlns="uri:oozie:workflow:0.4" name="hello">
    <start to="shell-check-hour" />
    <action name="shell-check-hour">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>hello.py</exec>
            <file>hdfs://localhost:8020/user/test/hello.py</file>
            <capture-output />
        </shell>
        <ok to="end" />
        <error to="fail" />
    </action>
    <kill name="fail">
        <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end" />
</workflow-app>
If I try just the file name instead, I get a file-not-found error. I don't understand this, since the Python script and the file are in the same HDFS location:

filein = 'myfile.txt'

Perhaps I need to modify my Oozie workflow to add the file as a parameter as well? Something like the sketch below.
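(For illustration only: the shell action only localizes what is listed in <file>, so hello.py lands in the action's working directory but myfile.txt does not. A second <file> element, assuming myfile.txt sits at the same HDFS path used in the answer below, would ship it alongside the script; the #myfile.txt fragment just names the local symlink.)

<exec>hello.py</exec>
<file>hdfs://localhost:8020/user/test/hello.py</file>
<!-- sketch: also ship the data file so the script can open it by bare name -->
<file>hdfs://localhost:8020/user/test/myfile.txt#myfile.txt</file>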

It turns out I needed a small modification to the Python script so that it opens the file from HDFS rather than from the local filesystem (the shell action runs on an arbitrary cluster node, typically as a different user, so local client-side paths are not reliable). Here is sample code that opens and reads the file:

import subprocess

# Stream the file out of HDFS with the hadoop CLI instead of open()
cat = subprocess.Popen(["hadoop", "fs", "-cat", "/user/test/myfile.txt"],
                       stdout=subprocess.PIPE)
for line in cat.stdout:
    print(line)
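If it helps anyone, a slightly more defensive variant of the same idea checks the command's exit code, so a missing file or a permissions problem raises an error instead of silently printing nothing (a sketch; it assumes the hadoop CLI is on the PATH inside the shell action's container):

import subprocess

# Run `hadoop fs -cat` and capture stdout and stderr separately
cat = subprocess.Popen(["hadoop", "fs", "-cat", "/user/test/myfile.txt"],
                       stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = cat.communicate()
if cat.returncode != 0:
    # Surface HDFS errors (missing file, no permission) rather than ignoring them
    raise RuntimeError("hadoop fs -cat failed: %s" % err.decode("utf-8"))
for line in out.decode("utf-8").splitlines():
    print(line)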

I know this is old, but I wanted to say thanks for coming back and posting the solution. It works great!