Hadoop job failed with state FAILED due to: Application failed 2 times due to ApplicationMaster for attempt appattempt timed out

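I submitted a job to a cluster running Hadoop 2.7.1. "jps" looks fine on both the master and the slaves, and "hdfs dfsadmin -report" looks fine as well, but whenever I run any grep or wordcount example, it fails. Even with a very small input file, the job hangs for half an hour to an hour and then fails with the following error: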

15/12/09 08:42:55 INFO impl.YarnClientImpl: Submitted application application_1449645631518_0003
15/12/09 08:42:55 INFO mapreduce.Job: The url to track the job: http://Master:8088/proxy/application_1449645631518_0003/
15/12/09 08:42:55 INFO mapreduce.Job: Running job: job_1449645631518_0003
15/12/09 09:07:12 INFO mapreduce.Job: Job job_1449645631518_0003 running in uber mode : false
15/12/09 09:07:12 INFO mapreduce.Job:  map 0% reduce 0%
15/12/09 09:07:12 INFO mapreduce.Job: Job job_1449645631518_0003 failed with state FAILED due to: Application application_1449645631518_0003 failed 2 times due to ApplicationMaster for attempt appattempt_1449645631518_0003_000002 timed out. Failing the application.
15/12/09 09:07:12 INFO mapreduce.Job: Counters: 0
15/12/09 09:07:13 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/09 09:07:13 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1449645631518_0004
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://Master:9000/user/hadoop/grep-temp-105897268
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:323)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:265)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
    at org.apache.hadoop.examples.Grep.run(Grep.java:94)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.Grep.main(Grep.java:103)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

Here is the ResourceManager log:

2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: The number of failed attempts is 2. The max attempts is 2
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1449645631518_0005 with final state: FAILED
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Updating info for app: application_1449645631518_0005
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1449645631518_0005 State change from ACCEPTED to FINAL_SAVING
2015-12-09 12:37:11,661 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application Attempt appattempt_1449645631518_0005_000002 is done. finalState=FAILED
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1449645631518_0005_02_000001 Container Transitioned from RUNNING to KILLED
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1449645631518_0005_02_000001 in state: KILLED event:KILL
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS APPID=application_1449645631518_0005 CONTAINERID=container_1449645631518_0005_02_000001
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1449645631518_0005_02_000001 of capacity <memory:2048, vCores:1> on host Slave2:48352, which currently has 0 containers, <memory:0, vCores:0> used and <memory:8192, vCores:8> available, release resources=true
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: default used=<memory:0, vCores:0> numContainers=0 user=hadoop user-resources=<memory:0, vCores:0>
2015-12-09 12:37:11,662 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1449645631518_0005_02_000001, NodeId: Slave2:48352, NodeHttpAddress: Slave2:8042, Resource: <memory:2048, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 11.11.1.3:48352 }, ] queue=default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0 cluster=<memory:16384, vCores:16>
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: completedContainer queue=root usedCapacity=0.0 absoluteUsedCapacity=0.0 used=<memory:0, vCores:0> cluster=<memory:16384, vCores:16>
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Re-sorting completed queue: root.default stats: default: capacity=1.0, absoluteCapacity=1.0, usedResources=<memory:0, vCores:0>, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=0
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application attempt appattempt_1449645631518_0005_000002 released container container_1449645631518_0005_02_000001 on node: host: Slave2:48352 #containers=0 available=<memory:8192, vCores:8> used=<memory:0, vCores:0> with event: KILL
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1449645631518_0005 requests cleared
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1449645631518_0005 user: hadoop queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2015-12-09 12:37:11,663 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1449645631518_0005 failed 2 times due to ApplicationMaster for attempt appattempt_1449645631518_0005_000002 timed out. Failing the application.
2015-12-09 12:37:11,667 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1449645631518_0005 State change from FINAL_SAVING to FAILED
2015-12-09 12:37:11,667 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1449645631518_0005 user: hadoop leaf-queue of parent: root #applications: 0
2015-12-09 12:37:11,667 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=hadoop   OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE  DESCRIPTION=App failed with state: FAILED   PERMISSIONS=Application application_1449645631518_0005 failed 2 times due to ApplicationMaster for attempt appattempt_1449645631518_0005_000002 timed out. Failing the application. APPID=application_1449645631518_0005
2015-12-09 12:37:11,668 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1449645631518_0005,name=grep-search,user=hadoop,queue=default,state=FAILED,trackingUrl=http://Master:8088/cluster/app/application_1449645631518_0005,appMasterHost=N/A,startTime=1449663079331,finishTime=1449664631661,finalStatus=FAILED,memorySeconds=3177991,vcoreSeconds=1550,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=MAPREDUCE
2015-12-09 12:37:11,668 INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Cleaning master appattempt_1449645631518_0005_000002
2015-12-09 12:37:12,366 INFO org.apache.hadoop.yarn.server.resourcemanager.ClientRMService: Allocated new applicationId: 6
2015-12-09 12:37:12,710 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
2015-12-09 12:37:12,711 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Null container completed...
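
To put a number on how long the application sat before it was failed, the startTime and finishTime fields of the ApplicationSummary line can be subtracted. This is only a minimal sketch: it assumes the summary is a comma-separated list of key=value pairs and that both timestamps are epoch milliseconds. Only a subset of the fields is copied from the log (preemptedResources is left out because it contains an escaped comma), and the variable names are illustrative.

```python
from datetime import datetime, timezone

# Subset of the ApplicationSummary fields, copied from the ResourceManager log.
summary = (
    "appId=application_1449645631518_0005,name=grep-search,user=hadoop,"
    "queue=default,state=FAILED,startTime=1449663079331,"
    "finishTime=1449664631661,finalStatus=FAILED"
)

# Split the comma-separated key=value pairs into a dict.
fields = dict(pair.split("=", 1) for pair in summary.split(","))

# Assumption: startTime and finishTime are epoch timestamps in milliseconds.
start_ms = int(fields["startTime"])
finish_ms = int(fields["finishTime"])
elapsed_s = (finish_ms - start_ms) / 1000.0


def to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000.0, tz=timezone.utc)


print(fields["appId"], fields["state"])
print("started :", to_utc(start_ms).isoformat())
print("finished:", to_utc(finish_ms).isoformat())
print("elapsed : %.2f s (%.1f min)" % (elapsed_s, elapsed_s / 60))
```

The difference comes to 1552.33 seconds, a little under 26 minutes for the two ApplicationMaster attempts together, which is in line with the roughly half-hour wait described above.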