When calling Java from Python, the output appears in stderr

I am launching a MapReduce job from Python using the code in [1]. The problem is that the useful output data arrives in the stderr field [3] rather than in the stdout field [2]. Why does the correct data end up in stderr? Am I using Popen.communicate correctly? Is there a better way to launch Java execution from Python (rather than Jython)?

[1] The code snippet I use to launch the job in Hadoop:

import shlex
import subprocess

command = ("/home/xubuntu/Programs/hadoop/bin/hadoop jar "
           "/home/xubuntu/Programs/hadoop/medusa-java.jar mywordcount "
           "-Dfile.path=/home/xubuntu/Programs/medusa-2.0/temp/1443004585/job.attributes "
           "/input1 /output1")

try:
    process = subprocess.Popen(shlex.split(command),
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    out, err = process.communicate()
    print("Out %s" % out)
    print("Error %s" % err)

    if len(err) > 0:  # anything on stderr is treated as an exception
        # print("Going to launch exception")
        raise ValueError("Exception:\n" + err)
except ValueError as e:
    return e.message

return out
[2] Output in stdoutdata:

[2015-09-23 07:16:13,220: WARNING/Worker-17] Out My Setup
My get job name
My get job name
My get job name
org.apache.hadoop.mapreduce.lib.partition.HashPartitioner
---> Job 0: /input1, : /output1-1443006949
10.10.5.192
10.10.5.192:8032
[3] Output in stderrdata:

[2015-09-23 07:16:13,221: WARNING/Worker-17] Error 15/09/23 07:15:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/09/23 07:15:53 INFO client.RMProxy: Connecting to ResourceManager at  /10.10.5.192:8032
15/09/23 07:15:54 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/09/23 07:15:54 INFO input.FileInputFormat: Total input paths to process : 4
15/09/23 07:15:54 INFO mapreduce.JobSubmitter: number of splits:4
15/09/23 07:15:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1442999930174_0009
15/09/23 07:15:54 INFO impl.YarnClientImpl: Submitted application application_1442999930174_0009
15/09/23 07:15:54 INFO mapreduce.Job: The url to track the job: http://hadoop-coc-1:9046/proxy/application_1442999930174_0009/
15/09/23 07:15:54 INFO mapreduce.Job: Running job: job_1442999930174_0009
15/09/23 07:16:00 INFO mapreduce.Job: Job job_1442999930174_0009 running in uber mode : false
15/09/23 07:16:00 INFO mapreduce.Job:  map 0% reduce 0%
15/09/23 07:16:13 INFO mapreduce.Job:  map 100% reduce 0%
15/09/23 07:16:13 INFO mapreduce.Job: Job job_1442999930174_0009 completed successfully
15/09/23 07:16:13 INFO mapreduce.Job: Counters: 30
    File System Counters
            FILE: Number of bytes read=0
            FILE: Number of bytes written=423900
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=472
            HDFS: Number of bytes written=148
            HDFS: Number of read operations=20
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=8
    Job Counters 
            Launched map tasks=4
            Data-local map tasks=4
            Total time spent by all maps in occupied slots (ms)=41232
            Total time spent by all reduces in occupied slots (ms)=0
            Total time spent by all map tasks (ms)=41232
            Total vcore-seconds taken by all map tasks=41232
            Total megabyte-seconds taken by all map tasks=42221568
    Map-Reduce Framework
            Map input records=34
            Map output records=34
            Input split bytes=406
            Spilled Records=0
            Failed Shuffles=0
            Merged Map outputs=0
            GC time elapsed (ms)=532
            CPU time spent (ms)=1320
            Physical memory (bytes) snapshot=245039104
            Virtual memory (bytes) snapshot=1272741888
            Total committed heap usage (bytes)=65273856
    File Input Format Counters 
Hadoop (specifically Log4j) simply logs all [INFO] messages to stderr, in line with its configuration:

By default, Hadoop logs messages to Log4j. Log4j is configured via log4j.properties on the classpath. This file defines both what is logged and where. For applications, the default root logger is "INFO,console", which logs all messages at level INFO and above to the console's stderr. Servers log to "INFO,DRFA", which logs to a file that is rolled daily. Log files are named $HADOOP_LOG_DIR/hadoop-$HADOOP_IDENT_STRING-<server>.log.
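
Given that, the len(err) > 0 check in [1] will raise even on a fully successful run, because Hadoop's INFO logging always lands on stderr. A minimal sketch of my own (not from the original poster) that instead uses the process exit code to decide whether the job failed:

import shlex
import subprocess

def run_hadoop_job(command):
    """Run a hadoop command; treat a non-zero exit code as failure."""
    process = subprocess.Popen(shlex.split(command),
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    out, err = process.communicate()
    if process.returncode != 0:
        # Real failures are signalled by the exit code, not by the mere
        # presence of stderr output (which also carries the INFO logs).
        raise RuntimeError("hadoop exited with %d:\n%s"
                           % (process.returncode, err))
    return out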

I have never tried redirecting these logs to stdout myself, so I can't really help with that part, but another user suggested:

// Answer by Rajkumar Singh:
// To get your stdout and log messages on the console you can use the
// Apache Commons Logging framework in your mapper and reducer.

import java.io.IOException;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// LongWritable/Text stand in for the generic types that were elided
// in the original answer.
public class MyMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
    public static final Log log = LogFactory.getLog(MyMapper.class);

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // this goes to the task's stdout file
        System.out.println("Map key " + key);

        // this goes to the task's syslog file
        log.info("Map key " + key);

        if (log.isDebugEnabled()) {
            log.debug("Map key " + key);
        }
        context.write(key, value);
    }
}
I suggest giving it a try.
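
Note that with this approach the System.out.println output never comes back through your Popen pipes: it ends up in the stdout file of each task container on the cluster. If log aggregation is enabled, one way to pull it back from Python (my own sketch, reusing the application id printed in the submission log above) is:

import subprocess

# application id as printed in the stderr log above
app_id = "application_1442999930174_0009"

# 'yarn logs' prints the aggregated container logs (stdout, stderr,
# syslog) of a finished application
logs = subprocess.check_output(["yarn", "logs", "-applicationId", app_id])
print(logs)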

I am confused. From your stderr debug information I can only see that Hadoop is using System.err.println.
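
For completeness: if the goal is simply to see the job's log lines and its real output in a single stream on the Python side, Popen can fold stderr into stdout (a minimal sketch of mine, not from the answers above; alternatively, the console appender's target could in principle be switched from System.err to System.out in log4j.properties):

import shlex
import subprocess

# hypothetical shortened command for illustration
command = "hadoop jar medusa-java.jar mywordcount /input1 /output1"

# stderr=subprocess.STDOUT merges Hadoop's Log4j output into the same
# stream as stdout, so 'out' contains both and the second value is None
process = subprocess.Popen(shlex.split(command),
                           stdout=subprocess.PIPE,
                           stderr=subprocess.STDOUT)
out, _ = process.communicate()
print(out)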