Java: Runtime exception in Cloud Dataflow related to Coder serialization


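I ran into a strange problem while running a Dataflow pipeline. I wrote my own coder, but replacing it with AvroCoder, SerializableCoder, and others from the examples produces the same problem.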

After trying to launch the pipeline on the Dataflow service in streaming mode, I get this exception:

Exception in thread "main" java.lang.RuntimeException: Unable to deserialize Coder: ModelCoder. Check that a suitable constructor is defined.  See Coder for details.
  at com.google.cloud.dataflow.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:113)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.ensureCoderSerializable(DirectPipelineRunner.java:901)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.ensurePCollectionEncodable(DirectPipelineRunner.java:861)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.setPCollectionValuesWithMetadata(DirectPipelineRunner.java:789)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.setPCollection(DirectPipelineRunner.java:776)
  at com.google.cloud.dataflow.sdk.io.TextIO.evaluateReadHelper(TextIO.java:786)
  at com.google.cloud.dataflow.sdk.io.TextIO.access$000(TextIO.java:118)
  at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:327)
  at com.google.cloud.dataflow.sdk.io.TextIO$Read$Bound$1.evaluate(TextIO.java:323)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.visitTransform(DirectPipelineRunner.java:706)
  at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:219)
  at com.google.cloud.dataflow.sdk.runners.TransformTreeNode.visit(TransformTreeNode.java:215)
  at com.google.cloud.dataflow.sdk.runners.TransformHierarchy.visit(TransformHierarchy.java:102)
  at com.google.cloud.dataflow.sdk.Pipeline.traverseTopologically(Pipeline.java:252)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner$Evaluator.run(DirectPipelineRunner.java:662)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:374)
  at com.google.cloud.dataflow.sdk.runners.DirectPipelineRunner.run(DirectPipelineRunner.java:87)
  at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:174)
  at io.momentum.demo.models.pipeline.PlatformPipeline.main(PlatformPipeline.java:96)
Caused by: java.lang.IllegalStateException: Sub-class com.google.cloud.dataflow.sdk.util.CoderUtils$Jackson2Module$Resolver MUST implement `typeFromId(DatabindContext,String)
  at com.fasterxml.jackson.databind.jsontype.impl.TypeIdResolverBase.typeFromId(TypeIdResolverBase.java:77)
  at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._findDeserializer(TypeDeserializerBase.java:156)
  at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:106)
  at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:91)
  at com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:142)
  at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:42)
  at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:3760)
  at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2042)
  at com.fasterxml.jackson.databind.ObjectMapper.treeToValue(ObjectMapper.java:2529)
  at com.google.cloud.dataflow.sdk.util.Serializer.deserialize(Serializer.java:98)
  at com.google.cloud.dataflow.sdk.util.SerializableUtils.ensureSerializable(SerializableUtils.java:110)
  ... 18 more
My Coder implementation simply wraps AvroCoder and hooks into some of our own code:

public final class ModelCoder<M extends AppModel> extends AtomicCoder<M> {
  public static <T extends AppModel> ModelCoder<T> of(Class<T> clazz) {
    return new ModelCoder<>(clazz);
  }

  @JsonCreator
  @SuppressWarnings("unchecked")
  public static ModelCoder<?> of(@JsonProperty("kind") String classType) throws ClassNotFoundException {
    Class<?> clazz = Class.forName(classType);
    return of((Class<? extends AppModel>) clazz);
  }

  private String kind;

  public ModelCoder(Class<M> type) {
    this.kind = type.getSimpleName();
  }

  @Override
  public void encode(M value, OutputStream outStream, Context context) throws IOException, CoderException {
    CoderInternals.encode(value, outStream, context, new TypeReference<TypedSerializedModel<M>>() { });
  }

  @Override
  public M decode(InputStream inStream, Context context) throws IOException, CoderException {
    return CoderInternals.decode(inStream, context, new TypeReference<TypedSerializedModel<M>>() { });
  }

  @Override
  public CloudObject asCloudObject() {
    CloudObject co = super.asCloudObject();
    co.set("kind", kind);
    return co;
  }
}
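
One detail in the listing above can be checked independently of the exception: the constructor stores `type.getSimpleName()` in `kind`, while the `@JsonCreator` factory passes that string to `Class.forName`, which expects a fully qualified name. The following is a minimal, stdlib-only sketch of that difference. `KindRoundTrip` and `resolves` are hypothetical names used only for illustration, and `java.util.ArrayList` stands in for an `AppModel` subclass.

```java
final class KindRoundTrip {
  private KindRoundTrip() {}

  /** Returns true if Class.forName can resolve the given name. */
  public static boolean resolves(String name) {
    try {
      Class.forName(name);
      return true;
    } catch (ClassNotFoundException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Stand-in for an AppModel subclass (assumption for this sketch).
    Class<?> type = java.util.ArrayList.class;

    // "ArrayList" carries no package, so the lookup fails.
    System.out.println(resolves(type.getSimpleName())); // false

    // "java.util.ArrayList" is fully qualified, so the lookup succeeds.
    System.out.println(resolves(type.getName()));       // true
  }
}
```

If `kind` is meant to be fed back into `Class.forName`, storing `type.getName()` would probably be the safer choice. This is separate from the Jackson error in the stack trace.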

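You need a static method annotated with @JsonCreator so that the service can instantiate your coder on the worker. You also should not override asCloudObject(): it determines how your coder is serialized and sent to the worker, and your code will just send a serialized AvroCoder.

For an example of a coder that wraps another coder, see NullableCoder.java.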

Thanks for the pointer, @danielm. I've edited my code to reflect more accurately what I'm trying to do. AppModel is an abstract polymorphic base, so I'm adding the kind property to be able to resolve the type during deserialization. My coder now uses very simple objects and has a @JsonCreator (see above), but it still produces the exception shown.

Jackson 2.5.0+ requires TypeIdResolvers to implement typeFromId(DatabindContext,String), which the Dataflow SDK does not. Please verify that you are using Jackson 2.4.5.
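
To confirm which Jackson jar is actually on the classpath, a small stdlib-only probe along the following lines may help. It is a sketch, not part of the Dataflow SDK: `JacksonClasspathProbe`, `locationOf`, and `manifestVersionOf` are hypothetical names, and the version lookup assumes the jar's manifest records an Implementation-Version, which may not be the case.

```java
import java.security.CodeSource;

final class JacksonClasspathProbe {
  private JacksonClasspathProbe() {}

  /** Returns the location the named class was loaded from, or a placeholder. */
  public static String locationOf(String className) {
    try {
      Class<?> type = Class.forName(className);
      CodeSource source = type.getProtectionDomain().getCodeSource();
      // Classes from the bootstrap loader have no CodeSource.
      return source == null ? "(bootstrap or unknown)" : String.valueOf(source.getLocation());
    } catch (ClassNotFoundException | LinkageError e) {
      return "(not on classpath)";
    }
  }

  /** Returns the Implementation-Version from the jar manifest, if present. */
  public static String manifestVersionOf(String className) {
    try {
      Package pkg = Class.forName(className).getPackage();
      String version = pkg == null ? null : pkg.getImplementationVersion();
      return version == null ? "(no version in manifest)" : version;
    } catch (ClassNotFoundException | LinkageError e) {
      return "(not on classpath)";
    }
  }

  public static void main(String[] args) {
    // ObjectMapper appears in the stack trace, so it is the class probed here.
    String mapper = "com.fasterxml.jackson.databind.ObjectMapper";
    System.out.println("jar:     " + locationOf(mapper));
    System.out.println("version: " + manifestVersionOf(mapper));
  }
}
```

Run it with the same classpath as the pipeline. If the reported jar is 2.5.0 or newer, another dependency is probably pulling in a later Jackson than the one the SDK was built against.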