Google BigQuery: how to update a Google Cloud Dataflow job running on App Engine without clearing the BigQuery table

Tags: google-bigquery, google-cloud-dataflow

I have a Google Cloud Dataflow process running on App Engine. It listens for messages sent via Pub/Sub and streams them into BigQuery.

I updated my code and am trying to rerun the app, but I get this error:

Exception in thread "main" java.lang.IllegalArgumentException: BigQuery table is not empty

Is it still possible to update the Dataflow job without deleting the table? Since my code may change frequently, I don't want to delete the data in the table.

Here is my code:


import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import org.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableRow;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.BigQueryIO;
import com.google.cloud.dataflow.sdk.io.PubsubIO;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import com.google.cloud.dataflow.sdk.values.PCollection;

public class MyPipline {
    private static final Logger LOG = LoggerFactory.getLogger(MyPipline.class);
    private static String name;

    public static void main(String[] args) {

        List<TableFieldSchema> fields = new ArrayList<>();
        fields.add(new TableFieldSchema().setName("a").setType("string"));
        fields.add(new TableFieldSchema().setName("b").setType("string"));
        fields.add(new TableFieldSchema().setName("c").setType("string"));
        TableSchema tableSchema = new TableSchema().setFields(fields);

        DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
        options.setRunner(BlockingDataflowPipelineRunner.class);
        options.setProject("my-data-analysis");
        options.setStagingLocation("gs://my-bucket/dataflow-jars");
        options.setStreaming(true);

        Pipeline pipeline = Pipeline.create(options);

        PCollection<String> input = pipeline
                .apply(PubsubIO.Read.subscription(
                        "projects/my-data-analysis/subscriptions/myDataflowSub"));

        input.apply(ParDo.of(new DoFn<String, Void>() {

            @Override
            public void processElement(DoFn<String, Void>.ProcessContext c) throws Exception {
                LOG.info("json" + c.element());
            }

        }));
        String fileName = UUID.randomUUID().toString().replaceAll("-", "");


        input.apply(ParDo.of(new DoFn<String, String>() {
            @Override
            public void processElement(DoFn<String, String>.ProcessContext c) throws Exception {
                JSONObject firstJSONObject = new JSONObject(c.element());
                firstJSONObject.put("a", firstJSONObject.get("a").toString()+ "1000");
                c.output(firstJSONObject.toString());

            }

        }).named("update json")).apply(ParDo.of(new DoFn<String, TableRow>() {

            @Override
            public void processElement(DoFn<String, TableRow>.ProcessContext c) throws Exception {
                JSONObject json = new JSONObject(c.element());
                TableRow row = new TableRow().set("a", json.get("a")).set("b", json.get("b")).set("c", json.get("c"));
                c.output(row);
            }

        }).named("convert json to table row"))
                .apply(BigQueryIO.Write.to("my-data-analysis:mydataset.mytable").withSchema(tableSchema)
        );

        pipeline.run();
    }
}

You need to specify a writeDisposition on your BigQueryIO.Write transform — see the documentation. Depending on your requirements, you want either WRITE_TRUNCATE or WRITE_APPEND.
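A minimal sketch of the fix, assuming the Dataflow 1.x SDK used in the question's code: replace the final BigQueryIO.Write step with one that sets the write disposition explicitly.

```java
// Fragment of the pipeline from the question, with the write
// disposition set. WRITE_APPEND streams new rows into the existing
// table without touching the data already there; WRITE_TRUNCATE would
// replace the table contents instead.
.apply(BigQueryIO.Write
        .to("my-data-analysis:mydataset.mytable")
        .withSchema(tableSchema)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
```

The default disposition is WRITE_EMPTY, which refuses to write to a non-empty table — that is exactly the IllegalArgumentException you are seeing.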