Google BigQuery: how to update a Google Cloud Dataflow job running on App Engine without clearing the BigQuery table

Tags: google-bigquery, google-cloud-dataflow

I have a Google Cloud Dataflow process running on App Engine. It listens for messages sent via Pub/Sub and streams them into BigQuery.

I updated my code and am trying to rerun the application, but I get this error:

Exception in thread "main" java.lang.IllegalArgumentException: BigQuery table is not empty

Is there a way to update the Dataflow job without deleting the table? Since my code may change often, I don't want to delete the data in the table.

Here is my code:
public class MyPipline {
    // Note: the original code logged against BotPipline.class, which does not match this class name.
    private static final Logger LOG = LoggerFactory.getLogger(MyPipline.class);
    private static String name;

    public static void main(String[] args) {
        List<TableFieldSchema> fields = new ArrayList<>();
        fields.add(new TableFieldSchema().setName("a").setType("string"));
        fields.add(new TableFieldSchema().setName("b").setType("string"));
        fields.add(new TableFieldSchema().setName("c").setType("string"));
        TableSchema tableSchema = new TableSchema().setFields(fields);

        DataflowPipelineOptions options = PipelineOptionsFactory.as(DataflowPipelineOptions.class);
        options.setRunner(BlockingDataflowPipelineRunner.class);
        options.setProject("my-data-analysis");
        options.setStagingLocation("gs://my-bucket/dataflow-jars");
        options.setStreaming(true);
        Pipeline pipeline = Pipeline.create(options);

        PCollection<String> input = pipeline
                .apply(PubsubIO.Read.subscription(
                        "projects/my-data-analysis/subscriptions/myDataflowSub"));

        input.apply(ParDo.of(new DoFn<String, Void>() {
            @Override
            public void processElement(DoFn<String, Void>.ProcessContext c) throws Exception {
                LOG.info("json" + c.element());
            }
        }));

        String fileName = UUID.randomUUID().toString().replaceAll("-", "");

        input.apply(ParDo.of(new DoFn<String, String>() {
            @Override
            public void processElement(DoFn<String, String>.ProcessContext c) throws Exception {
                JSONObject firstJSONObject = new JSONObject(c.element());
                firstJSONObject.put("a", firstJSONObject.get("a").toString() + "1000");
                c.output(firstJSONObject.toString());
            }
        }).named("update json")).apply(ParDo.of(new DoFn<String, TableRow>() {
            @Override
            public void processElement(DoFn<String, TableRow>.ProcessContext c) throws Exception {
                JSONObject json = new JSONObject(c.element());
                TableRow row = new TableRow().set("a", json.get("a")).set("b", json.get("b")).set("c", json.get("c"));
                c.output(row);
            }
        }).named("convert json to table row"))
                .apply(BigQueryIO.Write.to("my-data-analysis:mydataset.mytable").withSchema(tableSchema));

        pipeline.run();
    }
}
You need to specify a write disposition on BigQueryIO.Write via withWriteDisposition - see the documentation for BigQueryIO.Write.WriteDisposition. Depending on your requirements, you want WRITE_TRUNCATE or WRITE_APPEND.
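As a sketch, the final write step in the pipeline above could be adjusted like this (a streaming pipeline appends rows, so WRITE_APPEND is the usual choice here; the table name and schema are the ones from the question):

```java
// Sketch of the adjusted write step, assuming the Dataflow 1.x SDK used in the question.
// WRITE_APPEND streams new rows into the existing table instead of requiring it to be empty;
// CREATE_IF_NEEDED creates the table on first run if it does not exist yet.
.apply(BigQueryIO.Write
        .to("my-data-analysis:mydataset.mytable")
        .withSchema(tableSchema)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));
```

With WRITE_APPEND the "BigQuery table is not empty" check no longer applies, so redeploying the updated job keeps the data already in the table.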