Apache Flink: items are grouped incorrectly with CoGroupByKey


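Data description

I have two datasets:

  • Records: the first dataset, with about 0.5-1M records per (key, day). For testing I use 2-3 keys and 5-10 days of data. My target is 1000+ keys. Each record contains a key, a timestamp (in microseconds) and some other data.
  • Configurations: the second dataset, which is rather small. It describes each key over time; you can think of it as a list of tuples: (key, start date, end date, description).

For exploration, I encoded the data as files of length-prefixed, binary-encoded Protocol Buffers messages. The files are additionally compressed with gzip. The data is sharded by date, and each file is about 10 MB.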
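The file layout described above can be sketched with the standard library alone. This is only an illustration under assumptions: the question does not say which length prefix is used, so the sketch assumes a base-128 varint prefix (the framing that protobuf's `writeDelimitedTo` / `parseDelimitedFrom` use), and the class name `LengthPrefixedGzip` is hypothetical. Records are treated as opaque byte arrays rather than real protobuf messages.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip-compressed stream of varint-length-prefixed records.
final class LengthPrefixedGzip {

    // Writes a non-negative int as a base-128 varint (7 payload bits per byte).
    static void writeVarint(OutputStream out, int value) throws IOException {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Reads a varint; returns -1 on a clean end of stream before the first byte.
    static int readVarint(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = in.read();
            if (b == -1) {
                if (shift == 0) {
                    return -1;
                }
                throw new EOFException("truncated length prefix");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("malformed varint");
    }

    // Encodes all records into one gzip-compressed byte array.
    static byte[] write(List<byte[]> records) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            for (byte[] record : records) {
                writeVarint(gz, record.length);
                gz.write(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Decodes every record from a gzip-compressed byte array.
    static List<byte[]> read(byte[] gzipped) {
        List<byte[]> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new GZIPInputStream(new ByteArrayInputStream(gzipped)))) {
            int length;
            while ((length = readVarint(in)) != -1) {
                byte[] record = new byte[length];
                in.readFully(record);
                records.add(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        List<byte[]> input = new ArrayList<>();
        input.add("first".getBytes());
        input.add(new byte[300]); // needs a two-byte length prefix
        System.out.println(read(write(input)).size() + " records round-tripped");
    }
}
```

With real messages, each `byte[]` would be the serialized form of one record, and the record type would decide how it is parsed.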

Pipeline

I use Apache Beam to express the pipeline.

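  • First, I add keys to both datasets. For the Records dataset the key is (key, timestamp rounded to the day). For the Configurations dataset the key is (key, day), where day is every timestamp value (pointing to midnight) between the start date and the end date.
  • The datasets are merged using CoGroupByKey.
  • As the key type I use org.apache.flink.api.java.tuple.Tuple2 with a Tuple2Coder from the repo.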
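The keying step above can be sketched as plain Java, independent of Beam. The sketch makes several assumptions that are not stated in the question: days are UTC days, rounding means rounding down to midnight, and the end date of a configuration is inclusive. The class and method names (`DayKeys`, `floorToDayUsec`, `daysBetween`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for computing the "day" part of the (key, day) grouping key.
final class DayKeys {
    static final long USEC_PER_DAY = 86_400_000_000L;

    // Rounds a microsecond timestamp down to midnight (UTC) of the same day.
    static long floorToDayUsec(long timestampUsec) {
        return Math.floorDiv(timestampUsec, USEC_PER_DAY) * USEC_PER_DAY;
    }

    // Lists every midnight timestamp from the start day to the end day.
    // Treating the end day as inclusive is an assumption.
    static List<Long> daysBetween(long startUsec, long endUsec) {
        List<Long> days = new ArrayList<>();
        long lastDay = floorToDayUsec(endUsec);
        for (long day = floorToDayUsec(startUsec); day <= lastDay; day += USEC_PER_DAY) {
            days.add(day);
        }
        return days;
    }

    public static void main(String[] args) {
        // A record timestamp a few hours into the day maps to that day's midnight.
        System.out.println(floorToDayUsec(1462665600000000L + 3_600_000_000L));
        // A configuration spanning two days yields two day keys.
        System.out.println(daysBetween(1462665600000000L, 1462752000000000L));
    }
}
```

Both datasets have to derive the day value in exactly the same way; otherwise CoGroupByKey sees two different keys for what is logically the same (key, day) pair.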

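Problem

If the Records dataset is small, e.g. 5 days, everything looks fine (check normal_run.log).

When I run the pipeline over 10+ days, I get an error saying that some records have no configuration (error_run.log).

Then I added some extra log messages: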

    (a.java:144) - 68643 items for KeyValue3 on: 1462665600000000
    (a.java:140) - no items for KeyValue3 on: 1463184000000000
    (a.java:123) - missing for KeyValue3 on: 1462924800000000
    (a.java:142) - 753707 items for KeyValue3 on: 1462924800000000 marked as no-loc
    (a.java:123) - missing for KeyValue3 on: 1462752000000000
    (a.java:142) - 749901 items for KeyValue3 on: 1462752000000000 marked as no-loc
    (a.java:144) - 754578 items for KeyValue3 on: 1462406400000000
    (a.java:144) - 751574 items for KeyValue3 on: 1463011200000000
    (a.java:123) - missing for KeyValue3 on: 1462665600000000
    (a.java:142) - 754758 items for KeyValue3 on: 1462665600000000 marked as no-loc
    (a.java:123) - missing for KeyValue3 on: 1463184000000000
    (a.java:142) - 694372 items for KeyValue3 on: 1463184000000000 marked as no-loc
    
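In the first line you can see that 68643 items were processed for KeyValue3 and time 1462665600000000.
Later, in line 9, the operation appears to process the same key again, but it reports that no configuration is available for these records.
Line 10 reports that they have been marked as no-loc.

Line 2 says there were no items for KeyValue3 and time 1463184000000000, but in line 11 you can see that items for this (key, day) pair were processed later and that they were missing a configuration.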
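The manual reading of the log can be automated with a small check that lists every (key, day) pair reported both by the path that found a configuration and by the path that did not. This is a rough sketch, assuming the log format shown above: lines saying "missing" or ending in "marked as no-loc" are taken to come from the missing-configuration path, and every other "items" line from the path that found one. The class name `SplitGroupFinder` is hypothetical.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic: finds (key, day) pairs that were emitted as two separate groups.
final class SplitGroupFinder {
    private static final Pattern LINE = Pattern.compile(
            "- (missing|no items|\\d+ items) for (\\S+) on: (\\d+)( marked as no-loc)?");

    static Set<String> splitGroups(List<String> logLines) {
        Set<String> withConfig = new TreeSet<>();
        Set<String> withoutConfig = new TreeSet<>();
        for (String line : logLines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue; // not one of the grouping log messages
            }
            String group = m.group(2) + "@" + m.group(3);
            boolean missingConfig = m.group(1).equals("missing") || m.group(4) != null;
            (missingConfig ? withoutConfig : withConfig).add(group);
        }
        // Keep only the pairs that show up on both paths.
        withConfig.retainAll(withoutConfig);
        return withConfig;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
                "(a.java:144) - 68643 items for KeyValue3 on: 1462665600000000",
                "(a.java:123) - missing for KeyValue3 on: 1462665600000000",
                "(a.java:142) - 754758 items for KeyValue3 on: 1462665600000000 marked as no-loc");
        System.out.println(splitGroups(log)); // prints [KeyValue3@1462665600000000]
    }
}
```

For a correct grouping the result should be empty, because CoGroupByKey is expected to emit each key exactly once.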

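Some clues

During one of the exploration runs I got an exception (exception_thrown.log).

Workaround (after more testing it does not work; keep using Tuple2)

I switched from using Tuple2 to a Protocol Buffers message: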

    message KeyDay {
      optional bytes key = 1;             // exposed as ByteString in the generated Java class
      optional int64 timestamp_usec = 2;  // day timestamp in microseconds
    }
    
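However, using Tuple2.of() is simply easier than KeyDay.newBuilder().setKey(…).setTimestampUsec(…).build().

After switching to a key class derived from protobuf.Message, the problem vanished for 10-15 days of data (a size that was a problem for Tuple2), but increasing the data size to 20 days revealed that it is still there.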

Related(?): the problem has been confirmed by the Flink developers.

Exception stack trace:

    05/26/2016 03:49:49 GroupReduce (GroupReduce at GroupByKey)(1/5) switched to FAILED
    java.lang.Exception: The data preparation for task 'GroupReduce (GroupReduce at GroupByKey)' , caused an error: Error obtaining the sorted input: Thread 'SortMerger spilling thread' terminated due to an exception: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: tried to access field com.esotericsoftware.kryo.io.Input.inputStream from class org.apache.flink.api.java.typeutils.runtime.NoFetchingInput
      at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:455)
      at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:345)
      at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
      at java.lang.Thread.run(Thread.java:745)
    Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger spilling thread' terminated due to an exception: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: tried to access field com.esotericsoftware.kryo.io.Input.inputStream from class org.apache.flink.api.java.typeutils.runtime.NoFetchingInput
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:619)
      at org.apache.flink.runtime.operators.BatchTask.getInput(BatchTask.java:1079)
      at org.apache.flink.runtime.operators.GroupReduceDriver.prepare(GroupReduceDriver.java:94)
      at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:450)
      ... 3 more
    Caused by: java.io.IOException: Thread 'SortMerger spilling thread' terminated due to an exception: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: tried to access field com.esotericsoftware.kryo.io.Input.inputStream from class org.apache.flink.api.java.typeutils.runtime.NoFetchingInput
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:799)
    Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: tried to access field com.esotericsoftware.kryo.io.Input.inputStream from class org.apache.flink.api.java.typeutils.runtime.NoFetchingInput
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:619)
      at org.apache.flink.runtime.operators.sort.LargeRecordHandler.finishWriteAndSortKeys(LargeRecordHandler.java:263)
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$SpillingThread.go(UnilateralSortMerger.java:1409)
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:796)
    Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated due to an exception: tried to access field com.esotericsoftware.kryo.io.Input.inputStream from class org.apache.flink.api.java.typeutils.runtime.NoFetchingInput
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:799)
    Caused by: java.lang.IllegalAccessError: tried to access field com.esotericsoftware.kryo.io.Input.inputStream from class org.apache.flink.api.java.typeutils.runtime.NoFetchingInput
      at org.apache.flink.api.java.typeutils.runtime.NoFetchingInput.readBytes(NoFetchingInput.java:122)
      at com.esotericsoftware.kryo.io.Input.readBytes(Input.java:297)
      at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:35)
      at com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ByteArraySerializer.read(DefaultArraySerializers.java:18)
      at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:706)
      at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:611)
      at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
      at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732)
      at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:228)
      at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:242)
      at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:144)
      at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:30)
      at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:144)
      at org.apache.flink.api.java.typeutils.runtime.TupleSerializer.deserialize(TupleSerializer.java:30)
      at org.apache.flink.runtime.io.disk.InputViewIterator.next(InputViewIterator.java:43)
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ReadingThread.go(UnilateralSortMerger.java:973)
      at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:796)
    