Google BigQuery: BigQueryIO write cannot add new fields, even with "allow field addition" set
I am using Apache Beam's BigQueryIO to load data into BigQuery, but the load job fails with the error:
"message": "Error while reading data, error message: JSON parsing error in row starting at position 0: No such field: Field_name.",
Here is the full configuration of the load job:
"configuration": {
  "jobType": "LOAD",
  "load": {
    "createDisposition": "CREATE_NEVER",
    "destinationTable": {
      "datasetId": "people",
      "projectId": "my_project",
      "tableId": "beam_load_test"
    },
    "ignoreUnknownValues": false,
    "schema": {
      "fields": [
        {
          "mode": "NULLABLE",
          "name": "First_name",
          "type": "STRING"
        },
        {
          "mode": "NULLABLE",
          "name": "Last_name",
          "type": "STRING"
        }
      ]
    },
    "schemaUpdateOptions": [
      "ALLOW_FIELD_ADDITION"
    ],
    "sourceFormat": "NEWLINE_DELIMITED_JSON",
    "sourceUris": [
      "gs://tmp_bucket/BigQueryWriteTemp/beam_load/043518a3-7bae-48ac-8068-f97430c32f58"
    ],
    "useAvroLogicalTypes": false,
    "writeDisposition": "WRITE_APPEND"
  }
}
I can see that the temporary files it creates in GCS look the way they should. A schema is also supplied, and it is being inferred with useBeamSchema().
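For context, the "No such field" error occurs when a row in the newline-delimited JSON temp file contains a key that is not in the destination table's schema and ignoreUnknownValues is false. A hypothetical row illustrating the mismatch (the new field's real name is redacted in the error above; "Middle_name" here is a stand-in):

```json
{"First_name": "Ada", "Last_name": "Lovelace", "Middle_name": "King"}
```

With the two-column schema shown in the load configuration, BigQuery would reject this row at parse time rather than add the extra column.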
Here is my pipeline code that writes to BigQuery:
pipeline.apply(
    "Write data to BQ",
    BigQueryIO
        .<GenericRecord>write()
        .optimizedWrites()
        .useBeamSchema()
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
        .withSchemaUpdateOptions(ImmutableSet.of(BigQueryIO.Write.SchemaUpdateOption.ALLOW_FIELD_ADDITION))
        .withCustomGcsTempLocation(options.getGcsTempLocation())
        .withNumFileShards(options.getNumShards().get())
        .withMethod(FILE_LOADS)
        .withTriggeringFrequency(Utils.parseDuration("10s"))
        .to(new TableReference()
            .setProjectId(options.getGcpProjectId().get())
            .setDatasetId(options.getGcpDatasetId().get())
            .setTableId(options.getGcpTableId().get()))
);
Any ideas on why the new fields are not being added?

Could you share the relevant pipeline code that extends the BigQueryIO class body? @mk_sta, I have added the pipeline code that writes to BigQuery.

Did you define the field names? If you specify the schema in a JSON file, you must define the new column in it. If the new column definition is missing, attempting to append data returns this error: "Error while reading data, error message: parsing error in row starting at position <int>: No such field: <field>."

As long as you load the data via the jobs.insert method, @Peter Kim's solution looks reasonable to me. Did you specify it in the input file? @artofdoe, was your issue resolved?