Hadoop Flume: java.io.IOException: Not a data file

Last night our disk filled up, and today I am seeing the following error in the Flume log:

22 Feb 2017 10:24:56,180 ERROR [pool-6-thread-1] (org.apache.flume.client.avro.ReliableSpoolingFileEventReader.openFile:504)  - Exception opening file: /.../flume_spool/data.../data_2017-02-21_17-15-00_8189
java.io.IOException: Not a data file.
        at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
        at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
        at org.apache.avro.file.DataFileWriter.appendTo(DataFileWriter.java:160)
        at org.apache.avro.file.DataFileWriter.appendTo(DataFileWriter.java:149)
        at org.apache.flume.serialization.DurablePositionTracker.<init>(DurablePositionTracker.java:141)
        at org.apache.flume.serialization.DurablePositionTracker.getInstance(DurablePositionTracker.java:76)
        at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.openFile(ReliableSpoolingFileEventReader.java:478)
        at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.getNextFile(ReliableSpoolingFileEventReader.java:459)
        at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:229)
        at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:227)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Flume version: 1.5.2

The "java.io.IOException: Not a data file" exception comes from a temporary directory in which Flume keeps metadata about the files it is processing.

That directory is controlled by the trackerDir directive in the spooldir source definition in flume.conf (by default it is .flumespool inside the spool directory).
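To see which tracker directory a given agent actually uses, it can help to look for a trackerDir line in the configuration. The function below is a minimal sketch; the function name and the config path you pass in are illustrative assumptions, not part of Flume.

```shell
# Print any uncommented trackerDir setting found in a Flume config file.
# If nothing is printed, the source presumably falls back to the default
# (.flumespool inside the spool directory).
show_tracker_dir() {
  conf_file="$1"   # assumption: path to your flume.conf
  grep -E '^[^#]*\.trackerDir[[:space:]]*=' "$conf_file"
}
```

Usage would look like show_tracker_dir /path/to/flume.conf, where the path is a placeholder for your own installation.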

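We ended up with empty metadata files, which lacked the header bytes that Avro expects to see at the start of a file (we are using an Avro sink). The actual data files had nothing wrong with them; only the metadata files were affected.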
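An Avro container file starts with the four magic bytes 'O', 'b', 'j', 0x01, and the stack trace shows Avro failing while it reads that header. A quick way to check a tracker file by hand is sketched below; the function name is made up for illustration.

```shell
# Return 0 if the file begins with the Avro container magic ("Obj" + 0x01),
# non-zero otherwise (including for an empty file).
has_avro_magic() {
  # Dump the first 4 bytes as hex and strip all whitespace from od's output.
  magic=$(head -c 4 "$1" | od -An -tx1 | tr -d '[:space:]')
  [ "$magic" = "4f626a01" ]
}
```

A zero-length .meta file fails this check, which would match the empty metadata files described above.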

So the solution was to delete .flumespool, and the problem resolved itself (after freeing up some disk space, of course).

  • Go into your spool folder:
    /.../flume_spool/data...
  • Run:
    find . -type f -empty
  • I think you will find:
    .flumespool/.flumespool-main.meta
  • Then:
    rm .flumespool/.flumespool-main.meta
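The steps above can also be sketched as a small function that moves empty tracker files aside instead of deleting them outright. The function name and the backup directory are illustrative assumptions, -maxdepth and -empty assume GNU or BSD find, and it is probably sensible to stop the Flume agent before running it.

```shell
# Move zero-length .meta files out of the tracker directory so that they
# can be inspected or restored later.
quarantine_empty_meta() {
  tracker_dir="$1"   # e.g. the .flumespool directory inside your spool folder
  backup_dir="$2"    # assumption: any writable directory outside tracker_dir
  mkdir -p "$backup_dir" || return 1
  # find's -name pattern also matches file names that start with a dot.
  find "$tracker_dir" -maxdepth 1 -type f -name '*.meta' -empty \
    -exec mv {} "$backup_dir"/ \;
}
```

Non-empty .meta files are left untouched, so a tracker file that still holds a valid position is not lost.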

  • This looks like a problem with the Flume version: the issue is reported as fixed in Flume 1.6.