Hadoop: unable to launch Hive query (MapReduce)

Tags: hadoop, cloudera, yarn


I'm having a problem with Hive queries. When I try to launch a count(*) query from the Hue interface, I get an exception like this:

15/01/23 15:06:42 ERROR operation.Operation: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:147)
    at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Launching the same query from the Hive CLI instead, I get:

hive> select count(*) from tweets; 
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149)
    at java.lang.StringCoding.decode(StringCoding.java:193)
    at java.lang.String.<init>(String.java:416)
    at com.google.protobuf.LiteralByteString.toString(LiteralByteString.java:148)
    at com.google.protobuf.ByteString.toStringUtf8(ByteString.java:572)
    at org.apache.hadoop.hdfs.protocol.proto.HdfsProtos$ExtendedBlockProto.getPoolId(HdfsProtos.java:743)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:525)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:751)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1188)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1324)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1432)
    at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1441)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:549)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy17.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1906)
    at org.apache.hadoop.hdfs.DistributedFileSystem$15.<init>(DistributedFileSystem.java:742)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listLocatedStatus(DistributedFileSystem.java:731)
    at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1664)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:300)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
    at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:75)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:336)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:302)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:435)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:525)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:517)
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. GC overhead limit exceeded
I tried to view the logs in the JobTracker, but I found

cat

How can I solve all these errors?

  • Cluster OS: CentOS 6.6
  • Hadoop distribution: Cloudera CDH 5.2
  • MapReduce: YARN

I have figured out the problem with the query

select count(*) from tweets;
The problem was that I had put the serde.jar in the wrong directory on some of the node hosts, which is why the query failed from the Hive CLI and Hue: CDH 4.* throws a Class Not Found exception, while CDH 5.* returns error code 2.
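
For anyone hitting the same thing, here is a minimal sketch of the fix (the JAR name and paths below are hypothetical; adjust them to your own SerDe and install layout): either copy the JAR into Hive's lib directory on every node, or register it per session with ADD JAR.

    # hypothetical SerDe JAR name/path; copy it into Hive's lib dir on every node
    sudo cp /tmp/json-serde-with-dependencies.jar /usr/lib/hive/lib/
    # or register it for a single session instead of installing it cluster-wide
    hive -e "ADD JAR /tmp/json-serde-with-dependencies.jar; select count(*) from tweets;"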


But the problem on the JobTracker (YARN) side still persists.
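
Since the GC overhead limit exceeded error is thrown in the client JVM while it lists HDFS files to compute input splits (see the DFSClient.listPaths frames in the stack trace above), one mitigation worth trying (my assumption, not a confirmed fix) is to raise the Hive client heap before launching the query:

    # raise the heap of the client-side JVM started by the hive launcher script
    # (2048 MB is an illustrative value; tune it to your number of input files)
    export HADOOP_HEAPSIZE=2048
    export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"
    hive -e 'select count(*) from tweets;'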

The reason it works in the Hive CLI but not in Beeline is that the Hive CLI does not enforce user/group security, whereas Beeline will honor some form of authorization: Sentry/Ranger (if installed) or HDFS-level permissions.
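
So if Beeline fails where the Hive CLI succeeds, checking the HDFS permissions on the table's warehouse directory is a reasonable first step. A short sketch, assuming the default warehouse location:

    # inspect ownership and permissions on the table directory
    hdfs dfs -ls /user/hive/warehouse/tweets
    # grant group read/execute access if the querying user's group is denied
    hdfs dfs -chmod -R 750 /user/hive/warehouse/tweets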