Google Cloud Dataflow: getting so many warnings when using a List with a custom POJO Java class in Apache Beam (Java)
Tags: google-cloud-dataflow, dataflow, apache-beam, apache-beam-internals

I am new to Apache Beam. I am using Apache Beam with Dataflow on GCP as the runner, and I get the following warning:
coder of type class org.apache.beam.sdk.coders.ListCoder has a #structuralValue method which does not return true when the encoding of the elements is equal. Element [Person [businessDay=01042020, departmentId=101, endTime=2020-04-01T09:06:02.000Z, companyId=242, startTime=2020-04-01T09:00:33.000Z], Person [businessDay=01042020, departmentId=101, endTime=2020-04-01T09:07:47.000Z, companyId=242, startTime=2020-04-01T09:06:03.000Z], Person [businessDay=01042020, departmentId=101, endTime=2020-04-01T09:48:25.000Z, companyId=242, startTime=2020-04-01T09:07:48.000Z]]
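The PCollection is keyed by a String, and each value is a List of Person objects.

If you run ./gradlew run in the project's root directory, you can verify the effect. You can also run it on Dataflow with:

./gradlew run --args='--runner=DataflowRunner --project=$YOUR_PROJECT_ID --tempLocation=gs://xxx/staging --stagingLocation=gs://xxx/staging'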
If you build the Person class from scratch, it should look like this:
import java.io.Serializable;
import java.util.Objects;

class Person implements Serializable {
    public Person(
            String businessDay,
            String departmentId,
            String companyId
    ) {
        this.businessDay = businessDay;
        this.departmentId = departmentId;
        this.companyId = companyId;
    }

    public String companyId() {
        return companyId;
    }

    public String businessDay() {
        return businessDay;
    }

    public String departmentId() {
        return departmentId;
    }

    // Two Person objects are equal when all three fields are equal.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (other == null) {
            return false;
        }
        if (getClass() != other.getClass()) {
            return false;
        }
        Person otherPerson = (Person) other;
        return this.businessDay.equals(otherPerson.businessDay)
                && this.departmentId.equals(otherPerson.departmentId)
                && this.companyId.equals(otherPerson.companyId);
    }

    // hashCode must use the same fields as equals.
    @Override
    public int hashCode() {
        return Objects.hash(this.businessDay, this.departmentId, this.companyId);
    }

    private final String businessDay;
    private final String departmentId;
    private final String companyId;
}
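To see why equals matters here, the sketch below round-trips a list through Java serialization and compares the copy with the original. It is an illustration under assumptions, not Beam's actual check: PlainPerson, ValuePerson, and RoundTrip are hypothetical names, and plain Java serialization stands in for the coder. Without equals, the decoded copy encodes to the same bytes but does not compare equal to the original, which is the kind of mismatch the warning describes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

// Hypothetical POJO that relies on Object's identity-based equals.
class PlainPerson implements Serializable {
    final String companyId;
    PlainPerson(String companyId) { this.companyId = companyId; }
}

// Hypothetical POJO that compares by field value.
class ValuePerson implements Serializable {
    final String companyId;
    ValuePerson(String companyId) { this.companyId = companyId; }

    @Override
    public boolean equals(Object other) {
        return other instanceof ValuePerson
                && Objects.equals(companyId, ((ValuePerson) other).companyId);
    }

    @Override
    public int hashCode() { return Objects.hash(companyId); }
}

// Stand-in for an encode/decode cycle (assumption: not Beam's coder API).
class RoundTrip {
    static byte[] bytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    static <T extends Serializable> T copy(T value) {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes(value)))) {
            @SuppressWarnings("unchecked")
            T decoded = (T) in.readObject();
            return decoded;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A list of PlainPerson is not equal to its own round-tripped copy, because List.equals delegates to the element's equals. A list of ValuePerson is.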
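I recommend:
- Use AutoValue instead of creating POJOs from scratch. Here are some examples, and you can look at the whole project. The advantage is that you do not have to implement equals and hashCode from scratch every time you create a new object type.
- In a KV, if the key is an iterable (such as a List), wrap it in an object and explicitly serialize it deterministically, because serialization in Java is not deterministic.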
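The second recommendation can be sketched roughly as follows. ListKey and toBytes are hypothetical names used only for illustration; in a real pipeline the encoding logic would presumably live in a custom Beam coder rather than on the key itself. The point is only that the byte layout is written out explicitly (element count, then each element in order), so equal keys always produce equal bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical wrapper around a list-valued key.
final class ListKey implements Serializable {
    private final List<String> parts;

    ListKey(List<String> parts) {
        // Defensive copy so the key cannot change after construction.
        this.parts = Collections.unmodifiableList(new ArrayList<>(parts));
    }

    // Explicit encoding: element count followed by each element in list order.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(parts.size());
            for (String part : parts) {
                out.writeUTF(part);
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.toByteArray();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ListKey && parts.equals(((ListKey) other).parts);
    }

    @Override
    public int hashCode() {
        return parts.hashCode();
    }
}
```

Two ListKey objects built from different List implementations with the same elements are equal and encode to identical bytes, while a different element order yields a different encoding.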
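Thanks for the answer. In my PCollection the key is a String only and the value is a List, as I described in the question.

It looks like, when Dataflow executes, all the objects in the list are the same (Person objects), so it emits the warning about returning true, but the Person objects have different attribute values. I don't know how to resolve this warning.

I guess the equals method might not be implemented correctly. Here is a working ticket. I did see that warning when equals was not implemented, but once equals was in place the warning went away.

Thanks, there was a small bug in the equals method when comparing integers. I have accepted the answer.

I got another warning related to the equals method: Can't verify serialized elements of type BoundedSource have well defined equals method. I have overridden the equals method like this: if (obj instanceof Person) { Person otherPerson = (Person) other; return Objects.equals(this.businessDay, otherPerson.businessDay) && Objects.equals(this.departmentId, otherPerson.departmentId) && Objects.equals(this.companyId, otherPerson.companyId); } What is wrong with the equals method? I can't use AutoValue; I would have to make changes in many places, and other team members are also using the class.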
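The equals snippet quoted in the last comment tests obj but then casts other, so as written it would not compile unless both names exist. A minimal sketch with a single parameter name is shown below. PersonRecord is a hypothetical class, and the Integer field types are an assumption based on the mention of comparing integers. Objects.equals is null-safe and compares boxed Integer values by value, whereas == on Integer compares references and typically only happens to work for small cached values. Whether this silences the BoundedSource warning is not confirmed by the thread.

```java
import java.io.Serializable;
import java.util.Objects;

// Hypothetical variant of Person with boxed Integer ids.
final class PersonRecord implements Serializable {
    private final String businessDay;
    private final Integer departmentId;
    private final Integer companyId;

    PersonRecord(String businessDay, Integer departmentId, Integer companyId) {
        this.businessDay = businessDay;
        this.departmentId = departmentId;
        this.companyId = companyId;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        // instanceof is false for null, so no separate null check is needed.
        if (!(other instanceof PersonRecord)) {
            return false;
        }
        PersonRecord that = (PersonRecord) other;
        // Objects.equals compares by value and tolerates null fields.
        return Objects.equals(businessDay, that.businessDay)
                && Objects.equals(departmentId, that.departmentId)
                && Objects.equals(companyId, that.companyId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(businessDay, departmentId, companyId);
    }
}
```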