Hadoop Hive select count(*) fails


I have a Hive external table containing JSON records. So far I have inserted 9 GB of records into the table. When I try to run select count(*) from abc, I get the following error:

Hadoop job information for Stage-1: number of mappers: 34; number of reducers: 1
2017-03-10 09:02:37,777 Stage-1 map = 0%,  reduce = 0%
2017-03-10 09:03:01,204 Stage-1 map = 1%,  reduce = 0%, Cumulative CPU 234.61 sec
2017-03-10 09:03:11,878 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 440.25 sec
2017-03-10 09:03:12,909 Stage-1 map = 2%,  reduce = 0%, Cumulative CPU 409.98 sec
2017-03-10 09:03:13,965 Stage-1 map = 3%,  reduce = 0%, Cumulative CPU 422.4 sec
2017-03-10 09:03:15,002 Stage-1 map = 7%,  reduce = 0%, Cumulative CPU 426.58 sec
2017-03-10 09:03:16,028 Stage-1 map = 6%,  reduce = 0%, Cumulative CPU 401.35 sec
2017-03-10 09:03:18,383 Stage-1 map = 20%,  reduce = 0%, Cumulative CPU 436.33 sec
2017-03-10 09:03:20,436 Stage-1 map = 23%,  reduce = 0%, Cumulative CPU 426.7 sec
2017-03-10 09:03:21,462 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 450.36 sec
2017-03-10 09:03:22,493 Stage-1 map = 38%,  reduce = 0%, Cumulative CPU 455.93 sec
2017-03-10 09:03:23,522 Stage-1 map = 52%,  reduce = 0%, Cumulative CPU 464.36 sec
2017-03-10 09:03:26,601 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 321.17 sec
MapReduce Total cumulative CPU time: 5 minutes 21 seconds 170 msec
Ended Job = job_1489116838071_0002 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://...........................
Task with the most failures(4):
-----
Task ID:
  task_1489116838071_0002_m_000018

URL:
  http://ip-10-16-37-124:8088/taskdetails.jsp?jobid=job_1489116838071_0002&tipid=task_1489116838071_0002_m_000018
-----
Diagnostic Messages for this Task:
Exception from container-launch.
Container id: container_1489116838071_0002_01_000065
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:582)
        at org.apache.hadoop.util.Shell.run(Shell.java:479)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:744)


Container exited with a non-zero exit code 1


FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 34  Reduce: 1   Cumulative CPU: 321.17 sec   HDFS Read: 4112196268 HDFS Write: 0 FAIL
If the table is smaller, count(*) works fine.

hadoop-user-namenode-ip-xxx.log:

mapred-user-historyserver-ip-xxx.log:

yarn-user-resourcemanager-ip-xxx.log:


I have set values for vCores, map tasks, reduce tasks, etc. in yarn-site.xml and mapred-site.xml, but they may need some tuning.
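As an illustrative sketch (not the asker's actual file), per-container memory for map and reduce tasks is set in mapred-site.xml; the 3072 MB values below mirror the numbers mentioned later in the comments, and the right values depend on the node's yarn.nodemanager.resource.memory-mb:

```xml
<!-- mapred-site.xml: illustrative values only -->
<configuration>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>3072</value>
  </property>
  <!-- The JVM heap must stay below the container size, commonly ~80% of it -->
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx2457m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2457m</value>
  </property>
</configuration>
```

If the java.opts heap is left at (or raised above) the container size, YARN kills the container for exceeding its memory quota, which produces exactly the non-zero container exit seen above.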

Comments:

- Have you tried set mapreduce.map.memory.mb=xxxx; (the number of megabytes per container, e.g. 4096)? Similarly, for the reduce step you can try set mapreduce.reduce.memory.mb=xxxx;
- The relevant logs to check are the YARN job logs. That is, when Hive refers to job_1489116838071_0002, you have to go to the ResourceManager UI and search for application_1489116838071_0002 (the job_ prefix is a leftover from the pre-YARN era), or just run yarn logs -applicationId application_1489116838071_0002. Also try yarn application -status application_1489116838071_0002 for a summary. In some cases the logs cannot be collected, e.g. when the AM fails to start, or when a container is killed -9 by YARN for exceeding its RAM quota / being preempted by the scheduler, etc.
- @spijs mapreduce.map.memory.mb is set to 3072 in mapred-site.xml, and mapreduce.reduce.memory.mb is also set to 3072.
- @SamsonScharfrichter yarn logs -applicationId application_1490552610386_0013 gives: 17/03/30 11:52:16 INFO client.RMProxy: Connecting to ResourceManager at ip-10-16-37-124/10.16.37.124:8050; 17/03/30 11:52:16 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable; /logs/platformteam/logs/application_1490552610386_0013 does not exist. Log aggregation has not completed or is not enabled.
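The debugging steps suggested in the comments can be collected into a short sketch (application IDs are taken from the logs above; these commands must be run on the cluster, and yarn logs only works when log aggregation is enabled):

```shell
# A Hive job id job_X maps to YARN application id application_X.
# Summary of final status, diagnostics and resource usage:
yarn application -status application_1489116838071_0002

# Fetch aggregated container logs. Requires yarn.log-aggregation-enable=true
# in yarn-site.xml; otherwise inspect the NodeManager's local log directories.
yarn logs -applicationId application_1489116838071_0002

# Memory overrides can also be tried per session from the Hive CLI
# before changing mapred-site.xml cluster-wide:
hive -e "set mapreduce.map.memory.mb=3072;
         set mapreduce.reduce.memory.mb=3072;
         select count(*) from abc;"
```

The "does not exist / Log aggregation has not completed or is not enabled" message in the last comment indicates the second command cannot help until log aggregation is turned on; until then, the per-container stderr/stdout under the NodeManager's yarn.nodemanager.log-dirs is the place to look for the real cause of exit code 1.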
2017-03-10 09:03:26,997 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_1073826134_94647{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-6d5c8723-5320-401d-ac04-7cf64fc3f723:NORMAL:10.16.37.61:50010|RBW], ReplicaUC[[DISK]DS-206e72c4-9ca8-4638-83d5-549e08a1dc04:NORMAL:10.16.37.208:50010|RBW], ReplicaUC[[DISK]DS-ccf79a81-25d6-41c2-a133-83cded4ba189:NORMAL:10.16.37.32:50010|RBW]]} for /opt/history/done_intermediate/user/job_1489116838071_0002-1489136553618-user-select+count%28*%29+from+abc%28Stage%2D1%29-1489136605998-15-0-FAILED-default-1489136557174.jhist_tmp
2017-03-10 09:03:27,010 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.16.37.32:50010 is added to blk_1073826134_94647{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-6d5c8723-5320-401d-ac04-7cf64fc3f723:NORMAL:10.16.37.61:50010|RBW], ReplicaUC[[DISK]DS-206e72c4-9ca8-4638-83d5-549e08a1dc04:NORMAL:10.16.37.208:50010|RBW], ReplicaUC[[DISK]DS-ccf79a81-25d6-41c2-a133-83cded4ba189:NORMAL:10.16.37.32:50010|RBW]]} size 0
2017-03-10 09:03:27,011 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.16.37.208:50010 is added to blk_1073826134_94647{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-6d5c8723-5320-401d-ac04-7cf64fc3f723:NORMAL:10.16.37.61:50010|RBW], ReplicaUC[[DISK]DS-206e72c4-9ca8-4638-83d5-549e08a1dc04:NORMAL:10.16.37.208:50010|RBW], ReplicaUC[[DISK]DS-ccf79a81-25d6-41c2-a133-83cded4ba189:NORMAL:10.16.37.32:50010|RBW]]} size 0
2017-03-10 09:03:27,011 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap updated: 10.16.37.61:50010 is added to blk_1073826134_94647{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-6d5c8723-5320-401d-ac04-7cf64fc3f723:NORMAL:10.16.37.61:50010|RBW], ReplicaUC[[DISK]DS-206e72c4-9ca8-4638-83d5-549e08a1dc04:NORMAL:10.16.37.208:50010|RBW], ReplicaUC[[DISK]DS-ccf79a81-25d6-41c2-a133-83cded4ba189:NORMAL:10.16.37.32:50010|RBW]]} size 0
2017-03-10 09:03:27,012 INFO org.apache.hadoop.hdfs.StateChange: DIR* completeFile: /opt/history/done_intermediate/user/job_1489116838071_0002-1489136553618-user-select+count%28*%29+from+abc%28Stage%2D1%29-1489136605998-15-0-FAILED-default-1489136557174.jhist_tmp is closed by DFSClient_NONMAPREDUCE_-1618794566_1
2017-03-10 09:04:17,871 INFO org.apache.hadoop.mapreduce.jobhistory.JobSummary: jobId=job_1489116838071_0002,submitTime=1489136553618,launchTime=1489136557174,firstMapTaskLaunchTime=1489136559405,firstReduceTaskLaunchTime=1489136603997,finishTime=1489136605998,resourcesPerMap=3072,resourcesPerReduce=3072,numMaps=34,numReduces=1,user=user,queue=default,status=FAILED,mapSlotSeconds=4435,reduceSlotSeconds=2,jobName=select count(*) from abc(Stage-1)
2017-03-10 09:03:33,532 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary: appId=application_1489116838071_0002,name=select count(*) from abc(Stage-1),user=user,queue=default,state=FINISHED,trackingUrl=http://ip-10-16-37-124:8088/proxy/application_1489116838071_0002/,appMasterHost=ip-10-16-37-61,startTime=1489136553618,finishTime=1489136607062,finalStatus=FAILED,memorySeconds=5025083,vcoreSeconds=1579,preemptedAMContainers=0,preemptedNonAMContainers=0,preemptedResources=<memory:0\, vCores:0>,applicationType=MAPREDUCE
2017-03-10 09:03:33,532 INFO org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: container_1489116838071_0002_01_000069 Container Transitioned from ACQUIRED to KILLED
2017-03-10 09:03:33,532 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.common.fica.FiCaSchedulerApp: Completed container: container_1489116838071_0002_01_000069 in state: KILLED event:KILL
2017-03-10 09:03:33,532 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=user     OPERATION=AM Released Container TARGET=SchedulerApp     RESULT=SUCCESS  APPID=application_1489116838071_0002    CONTAINERID=container_1489116838071_0002_01_000069
2017-03-10 09:03:33,532 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode: Released container container_1489116838071_0002_01_000069 of capacity <memory:3072, vCores:1> on host ip-10-16-37-61:38344, which currently has 0 containers, <memory:0, vCores:0> used and <memory:40960, vCores:8> available, release resources=true