Java BigQueryIO: writing to two tables for each item

java google-bigquery google-cloud-dataflow

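I am trying to write a Dataflow job using Apache Beam. The job needs to take an input and transform it into my custom object. This object represents a memory test; it contains fixed attributes such as a timestamp and a name, plus a list of partitions and their attributes.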

public class TestResult {

    String testName;
    String testId;
    String testStatus;
    String testResult;
    List<Partition> testPartitions;
}
public class Partition {
    String testId;
    String filesystem;
    String mountedOn;
    String usePercentage;
    String available;
    String size;
    String used;
}

Edit:

This is my pipeline:

pipeline.apply("ReadFromPubSubToBigQuery_MemoryTest", PubsubIO.readMessagesWithAttributes().fromTopic(options.getPubsubTopic()))
    .apply("MemoryTest_ProcessObject", ParDo.of(new ProcessTestResult()))
    .apply("MemoryTest_IdentifyMemoryTest",ParDo.of(new DetectTestType()))
    .apply("MemoryTest_TransformIntoTableRow", ParDo.of(new TestResultToRowConverter()).withOutputTags(partitionsTag))
    .apply("MemoryTest_WriteToBigQuery", BigQueryIO.writeTableRows().to(TestResultToRowConverter.getTableSpec1())
        .withSchema(TestResultToRowConverter.getMemoryTestSchema())
        .withWriteDisposition(WriteDisposition.WRITE_APPEND))

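A Beam pipeline is not limited to a single straight line of transforms applied one after another; that would be very restrictive.

You can apply as many transforms as you like to any PCollection.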

How do I adapt this to my pipeline? I edited my OP so you can see what it looks like.

Just extract everything up to and including MemoryTest_TransformIntoTableRow into a variable of type PCollection. Many Beam examples pass one PCollection to several transforms. For example, see this one, where gameEvents is used to compute both team scores and user scores.

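PCollection<TableRow> rows = ...;
// The same PCollection can be the input of several transforms.
// firstTable, secondTable, thirdTable and someMoreProcessing are placeholders.
rows.apply("WriteFirstTable", BigQueryIO.writeTableRows().to(firstTable));
rows.apply("WriteSecondTable", BigQueryIO.writeTableRows().to(secondTable));
rows.apply("MoreProcessing", someMoreProcessing)
    .apply("WriteThirdTable", BigQueryIO.writeTableRows().to(thirdTable));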
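
For the question's data model, the two writes would probably not consume identical rows: one table would get a single row per test, the other one row per partition. The following is a minimal plain-Java sketch of that split, with no Beam dependency. The class name RowSplitter, the trimmed-down field set, and the column names are assumptions made for illustration, not the asker's actual schema.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch, no Beam dependency. Row layout and column names are assumptions.
class RowSplitter {

    // Trimmed-down version of the question's Partition class.
    static class Partition {
        final String testId;
        final String filesystem;
        final String mountedOn;

        Partition(String testId, String filesystem, String mountedOn) {
            this.testId = testId;
            this.filesystem = filesystem;
            this.mountedOn = mountedOn;
        }
    }

    // Trimmed-down version of the question's TestResult class.
    static class TestResult {
        final String testName;
        final String testId;
        final String testStatus;
        final List<Partition> testPartitions = new ArrayList<>();

        TestResult(String testName, String testId, String testStatus) {
            this.testName = testName;
            this.testId = testId;
            this.testStatus = testStatus;
        }
    }

    // One row per test: only the fixed attributes, no partitions.
    static Map<String, Object> toTestRow(TestResult t) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("testId", t.testId);
        row.put("testName", t.testName);
        row.put("testStatus", t.testStatus);
        return row;
    }

    // One row per partition; testId links each row back to its test.
    static List<Map<String, Object>> toPartitionRows(TestResult t) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Partition p : t.testPartitions) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("testId", p.testId);
            row.put("filesystem", p.filesystem);
            row.put("mountedOn", p.mountedOn);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        TestResult t = new TestResult("memory", "t1", "PASSED");
        t.testPartitions.add(new Partition("t1", "/dev/sda1", "/"));
        t.testPartitions.add(new Partition("t1", "tmpfs", "/tmp"));
        System.out.println("test row: " + toTestRow(t));
        System.out.println("partition rows: " + toPartitionRows(t).size());
    }
}
```

In a Beam pipeline each map would instead be a TableRow, and the two row sets would be separate PCollections, each with its own BigQueryIO.writeTableRows(). One way to obtain them is a multi-output ParDo: withOutputTags takes a main TupleTag plus a TupleTagList of additional tags and produces a PCollectionTuple, from which each output is retrieved by its tag.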