
Hadoop Hive - ORC read issue with null decimal values - java.io.EOFException: Reading BigInteger past EOF


I am running into a problem in Hive when loading an ORC external table that contains null values in a column defined as DECIMAL(31,8). After the load, Hive appears unable to read the ORC file and can no longer show the records that are null in that field. Other records in the same ORC file read fine.

This only started happening recently, and we have not changed our Hive version. Surprisingly, ORC files previously loaded into the same table with nulls in the decimal field can still be queried without any problem.

We are using Hive 1.2.1. Below is the full stack trace Hive produces; I have replaced the actual HDFS location with <hdfs location>:
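For context, the kind of table involved might look like the following sketch; the table name, column names, and location are placeholders, not taken from the original report:

```sql
-- Hypothetical external ORC table matching the description above
CREATE EXTERNAL TABLE txn_amounts (
  txn_id  BIGINT,
  amount  DECIMAL(31,8)   -- nullable decimal column that triggers the read failure
)
STORED AS ORC
LOCATION '/path/to/orc/files';  -- placeholder HDFS location
```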

org.apache.hive.service.cli.HiveSQLException: java.io.IOException: java.io.IOException: Error reading file: <hdfs location>
        at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:352)
        at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:220)
        at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:685)
        at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:454)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:672)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.io.IOException: Error reading file: <hdfs location>
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:507)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:414)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:140)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1670)
        at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:347)
        ... 13 more
Caused by: java.io.IOException: Error reading file: <hdfs location>
        at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1051)
        at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger$OriginalReaderPair.next(OrcRawRecordMerger.java:263)
        at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.next(OrcRawRecordMerger.java:547)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$1.next(OrcInputFormat.java:1235)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$1.next(OrcInputFormat.java:1219)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$NullKeyRecordReader.next(OrcInputFormat.java:1151)
        at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$NullKeyRecordReader.next(OrcInputFormat.java:1137)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:474)
        ... 17 more
Caused by: java.io.EOFException: Reading BigInteger past EOF from compressed stream Stream for column 6 kind DATA position: 201 length: 201 range: 0 offset: 289 limit: 289 range 0 = 0 to 201 uncompressed: 362 to 362
        at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readBigInteger(SerializationUtils.java:176)
        at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$DecimalTreeReader.next(TreeReaderFactory.java:1264)
        at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.next(TreeReaderFactory.java:2004)
        at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:1044)
        ... 24 more

Set this in your Hive session before running the query:
`set hive.fetch.task.conversion=none;`
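This setting disables the fetch-task conversion, so the query runs as a full MapReduce/Tez job instead of using the fetch-side ORC reader, which may sidestep the failure. A minimal sketch (the table name `txn_amounts` and column `amount` are hypothetical placeholders):

```sql
-- Disable fetch-task conversion for the session, then re-run the failing query
SET hive.fetch.task.conversion=none;
SELECT * FROM txn_amounts WHERE amount IS NULL;  -- hypothetical query over the affected column
```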
Your question needs formatting.

@Sidney I ran into this problem as well. Did you ever find a solution?