Google Cloud Platform: loading a CSV file into BigQuery and appending a column with a default value


google-cloud-platform google-bigquery


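Following the example from Google (link and code below), I successfully loaded a CSV file into a BigQuery (BQ) table.
Now I want to append several files to the BQ table, and I want to add a new column, filename, that contains the name of the file each row came from.

Is there a way to add a column with default data?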

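You have several options:

You can rebuild the CSV with the file name as column data.

You can load the data into a temporary table, then move it to the final table in a second step that fills in the missing filename column.

You can turn the example into an external table, where _FILE_NAME is a pseudocolumn that you can query and then move into the final table. Read more about this in the BigQuery documentation on external data sources.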
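The external-table option can be sketched as follows. This is only an illustration under stated assumptions: the helper names (buildExternalTableDdl, buildCopyQuery, loadWithFileName) are hypothetical, the bigquery argument is assumed to be a client created with new BigQuery() as in the question's code, the final table is assumed to already exist with the columns name, post_abbr and filename, and the dataset and table IDs are assumed to be trusted identifiers.

```javascript
// Build DDL for an external table over the CSV files in Cloud Storage.
// The column list mirrors the schema used in the question's load job.
function buildExternalTableDdl(datasetId, externalTableId, uris) {
  const uriList = uris.map(uri => `'${uri}'`).join(', ');
  return `CREATE OR REPLACE EXTERNAL TABLE \`${datasetId}.${externalTableId}\` (
  name STRING,
  post_abbr STRING
)
OPTIONS (
  format = 'CSV',
  uris = [${uriList}],
  skip_leading_rows = 1
)`;
}

// Build the statement that copies rows into the final table.
// _FILE_NAME is the pseudocolumn holding the source file of each row.
function buildCopyQuery(datasetId, externalTableId, finalTableId) {
  return `INSERT INTO \`${datasetId}.${finalTableId}\` (name, post_abbr, filename)
SELECT name, post_abbr, _FILE_NAME AS filename
FROM \`${datasetId}.${externalTableId}\``;
}

// Run both statements with a BigQuery client passed in by the caller.
async function loadWithFileName(bigquery, datasetId, externalTableId, finalTableId, uris) {
  await bigquery.query({
    query: buildExternalTableDdl(datasetId, externalTableId, uris),
    location: 'US',
  });
  await bigquery.query({
    query: buildCopyQuery(datasetId, externalTableId, finalTableId),
    location: 'US',
  });
}
```

Note that _FILE_NAME holds the full gs:// path of the source file, so the stored value would be the whole URI rather than just the base name unless the query trims it.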

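I would say you have a few options:

Add a column to the CSV before uploading, for example by preprocessing it in JS.

Add each CSV file to a separate table, which is easy to create in BigQuery. That way you can easily see which data came from which file, and the file name stays available alongside the data.

Post-process the data by adding the column with regular SQL calls after the data has been loaded.

See also this possible duplicate.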
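The post-processing option might look like the sketch below. It assumes the statements run right after a single file has been loaded, so that every row whose filename is still NULL belongs to that file. The helper names are hypothetical, and bigquery is assumed to be a client created with new BigQuery() as in the question's code.

```javascript
// DDL that adds the filename column if the table does not have it yet.
function buildAddColumnDdl(datasetId, tableId) {
  return `ALTER TABLE \`${datasetId}.${tableId}\` ADD COLUMN IF NOT EXISTS filename STRING`;
}

// DML that tags the rows that have no file name yet.
// @filename is a named query parameter supplied at run time.
function buildBackfillQuery(datasetId, tableId) {
  return `UPDATE \`${datasetId}.${tableId}\` SET filename = @filename WHERE filename IS NULL`;
}

// Run both statements in order with a client passed in by the caller.
async function tagLoadedRows(bigquery, datasetId, tableId, filename) {
  await bigquery.query({
    query: buildAddColumnDdl(datasetId, tableId),
    location: 'US',
  });
  await bigquery.query({
    query: buildBackfillQuery(datasetId, tableId),
    params: {filename},
    location: 'US',
  });
}
```

Passing the file name as a query parameter avoids having to escape it inside the SQL text. Table names cannot be parameterized, so the dataset and table IDs are assumed to come from trusted configuration.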

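According to the BigQuery documentation [1] at the time of writing, there is no option to set a default value for a column. The closest option that needs no post-processing is to use NULL values in nullable columns.

However, one possible post-processing workaround is to create a view of the original table and add a script that maps NULL values to whatever default you want. Here is some information about scripting in BigQuery [2].

If you can add preprocessing code, it is easy to add the values to the source file with any scripting language.

I think static and function-based default values would be a good feature for BigQuery's future scope.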
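As a rough illustration of the view workaround, the sketch below builds a view definition that replaces NULL in a filename column with a default. It uses a plain IFNULL expression rather than a script, which is a substitution on my part, not something the answer specifies. The helper names and the filename column are assumptions, and the default is inlined as an escaped string literal on the assumption that it contains no line breaks.

```javascript
// Turn a JavaScript string into a single-quoted SQL string literal,
// escaping backslashes and single quotes. Assumes no line breaks.
function sqlStringLiteral(value) {
  return `'${value.replace(/\\/g, '\\\\').replace(/'/g, "\\'")}'`;
}

// Build DDL for a view that exposes every column of the table unchanged,
// except that NULL file names are replaced by the given default.
function buildDefaultView(datasetId, tableId, viewId, defaultValue) {
  return `CREATE OR REPLACE VIEW \`${datasetId}.${viewId}\` AS
SELECT * REPLACE (IFNULL(filename, ${sqlStringLiteral(defaultValue)}) AS filename)
FROM \`${datasetId}.${tableId}\``;
}
```

The resulting statement could be run with bigquery.query({query, location: 'US'}). The underlying table keeps its NULL values; only readers of the view see the default.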
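A minimal preprocessing sketch in JavaScript is shown below. It assumes the CSV has a header row and no quoted fields that span several lines, since it splits the text on line breaks. The function names are hypothetical.

```javascript
// Quote a CSV field only when it contains a comma, quote or line break.
function quoteCsvField(value) {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Append a column to CSV text: the header gets the column name,
// every data row gets the file name. Blank lines are left untouched.
function appendFilenameColumn(csvText, filename, columnName = 'filename') {
  const value = quoteCsvField(filename);
  return csvText
    .split(/\r?\n/)
    .map((line, index) => {
      if (line === '') {
        return line; // for example, the empty piece after a trailing newline
      }
      return index === 0 ? `${line},${columnName}` : `${line},${value}`;
    })
    .join('\n');
}
```

With the file rewritten this way, the schema of the load job would list filename as a third STRING field after name and post_abbr.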

[1]-

[2]-

Thanks. I will go with the second one.

// Import the Google Cloud client libraries
const {BigQuery} = require('@google-cloud/bigquery');
const {Storage} = require('@google-cloud/storage');

// Instantiate clients
const bigquery = new BigQuery();
const storage = new Storage();

/**
 * This sample loads the CSV file at
 * https://storage.googleapis.com/cloud-samples-data/bigquery/us-states/us-states.csv
 *
 * TODO(developer): Replace the following lines with the path to your file.
 */
const bucketName = 'cloud-samples-data';
const filename = 'bigquery/us-states/us-states.csv';

async function loadCSVFromGCS() {
  // Imports a GCS file into a table with manually defined schema.

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const datasetId = 'my_dataset';
  // const tableId = 'my_table';

  // Configure the load job. For full list of options, see:
  // https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad
  const metadata = {
    sourceFormat: 'CSV',
    skipLeadingRows: 1,
    schema: {
      fields: [
        {name: 'name', type: 'STRING'},
        {name: 'post_abbr', type: 'STRING'},
        // {name: 'filename', type: 'STRING', value: filename}, // I WANT TO ADD A COLUMN WITH THE FILE NAME HERE
      ],
    },
    location: 'US',
  };

  // Load data from a Google Cloud Storage file into the table
  const [job] = await bigquery
    .dataset(datasetId)
    .table(tableId)
    .load(storage.bucket(bucketName).file(filename), metadata);

  // load() waits for the job to finish
  console.log(`Job ${job.id} completed.`);

  // Check the job's status for errors
  const errors = job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
}