Google BigQuery: sharding BigQuery output tables

google-bigquery google-cloud-dataflow

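I read, both in the documentation and in this post, that the table destination can be determined dynamically. I used a very similar approach, shown below: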

PCollection<Foo> foos = ...;
foos.apply(BigQueryIO.write().to(new SerializableFunction<ValueInSingleWindow<Foo>, TableDestination>() {
  @Override
  public TableDestination apply(ValueInSingleWindow<Foo> value) {  
    Foo foo = value.getValue();
    // Also available: value.getWindow(), getTimestamp(), getPane()
    String tableSpec = ...;
    String tableDescription = ...;
    return new TableDestination(tableSpec, tableDescription);
  }
}).withFormatFunction(new SerializableFunction<Foo, TableRow>() {
  @Override
  public TableRow apply(Foo foo) {
    return ...;
  }
}).withSchema(...));

However, I get the following compile error:

The method to(String) in the type BigQueryIO.Write<Object> is not applicable for the arguments (new SerializableFunction<ValueInSingleWindow<Foo>,TableDestination>(){})

Any help would be appreciated.

Edit, to clarify how I use windowing in my case:

PCollection<Foo> validFoos = ...;           
PCollection<TableRow> validRows = validFoos.apply(ParDo.named("Convert Foo to table row")
        .of(new ConvertToValidTableRowFn()))
        .setCoder(TableRowJsonCoder.of());
TableSchema validSchema = ConvertToValidTableRowFn.getSchema();    

validRows.apply(Window.<TableRow>into(CalendarWindows.days(1))).apply(BigQueryIO.writeTableRows()
        .to(new SerializableFunction<ValueInSingleWindow<TableRow>, TableDestination>() {
            @Override
            public TableDestination apply(ValueInSingleWindow<TableRow> value) {
                TableRow t = value.getValue();
                String fooName = ""; // get name from table
                TableDestination td = new TableDestination(
                        "my-project:dataset.table$" + fooName, "");
                return td;
            }
        }));

In this case, I get the following error:

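The method apply(PTransform…

I believe the compile error comes from the fact that you are doing this on a PCollection<Foo>, whereas it actually expects windowed values. So you should first use .apply(Window.<Foo>into(...)) and then determine the table destination based on your window. You can see examples in related answers, as well as in the documentation you mentioned.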
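As a rough illustration of "determine the table destination based on your window", the sketch below builds a table spec that ends in a day-partition decorator (table$YYYYMMDD, the form BigQuery uses for partition decorators). It is plain Java with no Beam dependency. The class DailyTableSpec and the helper forWindowStart are hypothetical names introduced here, and the assumption that the window start is available as a java.time.Instant is mine; in a pipeline the start would presumably be read from value.getWindow() and converted first.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper class, not part of Beam or the Dataflow SDK.
final class DailyTableSpec {
    // Day formatter for the partition decorator, evaluated in UTC.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    // Builds "<baseTable>$<yyyyMMdd>" for the UTC day containing windowStart.
    static String forWindowStart(String baseTable, Instant windowStart) {
        return baseTable + "$" + DAY.format(windowStart);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2017-05-17T00:00:00Z");
        // Prints my-project:dataset.table$20170517
        System.out.println(forWindowStart("my-project:dataset.table", start));
    }
}
```

The resulting string could then be passed as the tableSpec argument of new TableDestination(tableSpec, tableDescription), as in the question's code.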
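Which SDK version are you using? From the other post: "This feature will be included in the first stable release of Apache Beam and in the next release of the Dataflow SDK (which will be based on the first stable release of Apache Beam). Right now you can use this by running your pipeline against a snapshot of Beam at HEAD from github."

I am using the Apache Beam 2.0.0 stable release, which came out in May. Its documentation says this feature is included. See the sharding section there.

I just came across an approach that handles this problem. It differs somewhat in syntax (it returns TableReferences instead of TableDestinations) and moves the code into a separate class, which makes it cleaner. I have not tested it myself (in the past I used code similar to yours), but I hope it helps.

I quickly tried your code myself and it did not seem to give me any errors. Can you check your POM and make sure your Beam version is 2.1.0-SNAPSHOT?

Thanks for the reply, but when I apply the windowing as described, this time I get: The method apply(PTransform…

You should change your code to (Window.<Foo>into(..), because in your case you have a PCollection<Foo> and you use the more generic write() method, whereas the answer's code uses a PCollection<TableRow> and the writeTableRows() method.

Yes, but in this case I convert my PCollection<Foo> foos to a PCollection<TableRow> fooRows before using this function.

You could try the following snippet: PCollection<TableRow> quotes = ...; quotes.apply(Window.<TableRow>into(CalendarWindows.days(1))).apply(BigQueryIO.writeTableRows().withSchema(schema).to(new SerializableFunction<ValueInSingleWindow, String>() { public String apply(ValueInSingleWindow value) { ... } })); Which part of the code does not compile for you?

I did the same, but I get the same error at the first apply. It complains about the method apply(PTransform…>)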
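One further observation on the first compile error: it names BigQueryIO.Write<Object>, which suggests that the element type of write() was inferred as Object rather than Foo. Java cannot infer a generic method's type argument from calls chained after it, so a type witness such as BigQueryIO.<Foo>write() may be worth trying. This was not confirmed in the thread. The sketch below reproduces the inference behaviour in plain Java. Write, write() and Foo here are hypothetical stand-ins shaped like the Beam API, not the Beam classes themselves.

```java
import java.util.function.Function;

// Self-contained sketch; none of these types come from Apache Beam.
final class TypeWitnessSketch {
    // Stand-in for a generic write transform with two to(...) overloads.
    static final class Write<T> {
        private final String description;

        Write(String description) {
            this.description = description;
        }

        // Overload taking a fixed table name.
        Write<T> to(String tableSpec) {
            return new Write<>("static:" + tableSpec);
        }

        // Overload taking a per-element function, tied to the element type T.
        Write<T> to(Function<T, String> tableFn) {
            return new Write<>("dynamic");
        }

        String description() {
            return description;
        }
    }

    // Generic static factory, the same shape as a static generic write() method.
    static <T> Write<T> write() {
        return new Write<>("unset");
    }

    static final class Foo {
        final String name;

        Foo(String name) {
            this.name = name;
        }
    }

    static String demo() {
        Function<Foo, String> tableFn = foo -> "my-project:dataset.table$" + foo.name;

        // Does not compile: write() has no target type here, so T becomes Object
        // and neither to(String) nor to(Function<Object, String>) accepts tableFn.
        // Write<Foo> bad = write().to(tableFn);

        // Compiles: the explicit type witness fixes T before to(...) is resolved.
        Write<Foo> ok = TypeWitnessSketch.<Foo>write().to(tableFn);
        return ok.description();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "dynamic"
    }
}
```

If the same reasoning applies to the question's code, the corresponding change would be foos.apply(BigQueryIO.<Foo>write().to(...)), with the rest of the chain left as it is.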