
Python: a map reduce job I used before doesn't work anymore


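I wrote a map reduce program in Python using Hadoop streaming, and it worked on the Udacity training virtual machine. To run the hadoop streaming command, they provided an
alias=hs mapper reducer input output
and it worked fine. I have now switched to the Cloudera training virtual machine and tried to run exactly the same map reduce with the actual streaming command, but it fails. Am I doing something wrong?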

The streaming command I used is:

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.6.0-cdh5.7.0.jar  -input test  -output eout  -mapper "matest1.py" -file matest1.py  -reducer "retest2.py" -file retest2.py
Is there any solution?
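For reference, a streaming mapper is simply a program that reads input lines on stdin and writes tab-separated key/value records to stdout. The sketch below is a hypothetical stand-in for `matest1.py` (the real script is not shown here, and the word-count logic is only an assumption for illustration). Because the command above passes `-mapper "matest1.py"` without naming an interpreter, the script is launched directly, so a shebang line on its first line may matter on a VM where no alias wraps the call.

```python
#!/usr/bin/env python
"""Hypothetical stand-in for matest1.py; the real mapper is not shown."""
import sys


def map_lines(lines):
    """Yield one "token<TAB>1" record per whitespace-separated token."""
    for line in lines:
        for token in line.split():
            yield "%s\t1" % token


def run(lines, out):
    """Write every mapped record to `out`, one per line."""
    for record in map_lines(lines):
        out.write(record + "\n")


if __name__ == "__main__":
    # Demo with in-memory lines; inside a streaming task, pass sys.stdin instead.
    run(["a b\n", "a\n"], sys.stdout)
```

If adding a shebang is not an option, naming the interpreter in the option itself, as in `-mapper "python matest1.py"`, is the usual alternative.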

Edit: adding the error output:

16/06/11 13:25:50 INFO mapreduce.Job: Task Id : attempt_1465622696533_0007_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

16/06/11 13:26:11 INFO mapreduce.Job:  map 100% reduce 100%
16/06/11 13:26:12 INFO mapreduce.Job: Job job_1465622696533_0007 failed with state FAILED due to: Task failed task_1465622696533_0007_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

16/06/11 13:26:13 INFO mapreduce.Job: Counters: 9
    Job Counters 
        Failed map tasks=8
        Launched map tasks=8
        Other local map tasks=6
        Data-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=177373
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=177373
        Total vcore-seconds taken by all map tasks=177373
        Total megabyte-seconds taken by all map tasks=181629952
16/06/11 13:26:13 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!
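The message `subprocess failed with code 1` only reports the exit status of the mapper process, and 1 is also the status Python returns after an uncaught exception. A minimal sketch for reproducing this outside Hadoop follows; the helper name `run_locally` and the one-line failing command are assumptions for illustration, not part of the job above. It feeds sample input to a command and captures the exit code and stderr, which is where a Python traceback would appear.

```python
import subprocess
import sys


def run_locally(command, sample_input):
    """Run a streaming command on sample bytes; return (exit_code, stdout, stderr)."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate(sample_input)
    return proc.returncode, out.decode(), err.decode()


if __name__ == "__main__":
    # Stand-in for a failing mapper: a one-liner that raises immediately.
    failing = [sys.executable, "-c", "raise ValueError('bad record')"]
    code, out, err = run_locally(failing, b"line one\n")
    print("exit code: %d" % code)
    print(err.strip().splitlines()[-1])
```

Replacing the stand-in command with `["python", "matest1.py"]` and a few lines of the real input would show whether the mapper itself raises on this VM.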
Edit: stderr:

Jun 12, 2016 12:09:29 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Jun 12, 2016 12:09:29 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Jun 12, 2016 12:09:29 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Jun 12, 2016 12:09:29 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Jun 12, 2016 12:09:29 PM com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
WARNING: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
Jun 12, 2016 12:09:30 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Jun 12, 2016 12:09:32 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Jun 12, 2016 12:09:34 PM com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
WARNING: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
Jun 12, 2016 12:09:35 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
Edit: syslog:

2016-06-12 12:12:31,459 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1465711165129_0004_m_000001_3: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

2016-06-12 12:12:31,460 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1465711165129_0004_m_000001_3 TaskAttempt Transitioned from RUNNING to FAIL_FINISHING_CONTAINER
2016-06-12 12:12:31,460 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1465711165129_0004_m_000000_3: Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

2016-06-12 12:12:31,461 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1465711165129_0004_m_000000_3 TaskAttempt Transitioned from RUNNING to FAIL_FINISHING_CONTAINER
2016-06-12 12:12:31,749 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1465711165129_0004_m_000001 Task Transitioned from RUNNING to FAILED
2016-06-12 12:12:31,749 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1465711165129_0004_m_000000 Task Transitioned from RUNNING to FAILED
2016-06-12 12:12:31,750 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2016-06-12 12:12:31,780 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job failed as tasks failed. failedMaps:1 failedReduces:0
2016-06-12 12:12:31,792 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1465711165129_0004Job Transitioned from RUNNING to FAIL_WAIT
2016-06-12 12:12:31,793 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1465711165129_0004_r_000000 Task Transitioned from SCHEDULED to KILL_WAIT
2016-06-12 12:12:31,793 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1465711165129_0004_r_000000_0 TaskAttempt Transitioned from UNASSIGNED to KILLED
2016-06-12 12:12:31,793 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1465711165129_0004_r_000000 Task Transitioned from KILL_WAIT to KILLED
2016-06-12 12:12:31,794 INFO [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Processing the event EventType: CONTAINER_DEALLOCATE
2016-06-12 12:12:31,796 ERROR [Thread-52] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Could not deallocate container for task attemptId attempt_1465711165129_0004_r_000000_0
2016-06-12 12:12:32,015 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1465711165129_0004Job Transitioned from FAIL_WAIT to FAIL_ABORT
2016-06-12 12:12:32,019 INFO [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_ABORT
2016-06-12 12:12:32,071 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Before Scheduling: PendingReds:1 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:2 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:12 ContRel:4 HostLocal:2 RackLocal:0
2016-06-12 12:12:32,074 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Recalculating schedule, headroom=<memory:4096, vCores:5>
2016-06-12 12:12:32,074 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Reduce slow start threshold reached. Scheduling reduces.
2016-06-12 12:12:32,074 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: All maps assigned. Ramping up all remaining reduces:1
2016-06-12 12:12:32,074 INFO [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: After Scheduling: PendingReds:0 ScheduledMaps:0 ScheduledReds:1 AssignedMaps:2 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:12 ContRel:4 HostLocal:2 RackLocal:0
2016-06-12 12:12:32,194 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1465711165129_0004Job Transitioned from FAIL_ABORT to FAILED
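In a syslog this long, the relevant lines are the `Diagnostics report` entries, which pair each failed attempt with the exit code of its subprocess. The following is a rough sketch for pulling those pairs out; the function name and the regular expression are assumptions fitted to the lines above, not a Hadoop API.

```python
import re

# Matches e.g. "Diagnostics report from attempt_..._m_000001_3: ... failed with code 1"
ATTEMPT_RE = re.compile(
    r"Diagnostics report from (attempt_\d+_\d+_[mr]_\d+_\d+): "
    r".*subprocess failed with code (\d+)"
)


def failed_attempts(log_lines):
    """Return (attempt_id, exit_code) pairs found in the given log lines."""
    results = []
    for line in log_lines:
        match = ATTEMPT_RE.search(line)
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


if __name__ == "__main__":
    # One diagnostics line and one unrelated line copied from the syslog above.
    sample = [
        "2016-06-12 12:12:31,459 INFO [AsyncDispatcher event handler] "
        "org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: "
        "Diagnostics report from attempt_1465711165129_0004_m_000001_3: "
        "Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): "
        "subprocess failed with code 1",
        "2016-06-12 12:12:31,750 INFO [AsyncDispatcher event handler] "
        "org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1",
    ]
    for attempt, code in failed_attempts(sample):
        print("%s exited with %d" % (attempt, code))
```

Both map attempts in the syslog report code 1, so the failure is in the mapper process rather than in the reducer, which was killed before it was assigned.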