Replace a BigQuery table with an API job

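I am running a data ETL job and then loading the data back into BigQuery.

I want to overwrite the destination table on every run, but at the moment my code appends new data to the table each time it runs. I have read the documentation on job_config and used it to set query parameters, but I can't work out how to set the write configuration for a query.

Here is what I have tried so far: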

roc_df = pd.DataFrame(roc_score)

job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

dataset_ref = client.dataset('Customers')
table_ref = dataset_ref.table('propensity_scores_test')

client.load_table_from_dataframe(roc_df, table_ref, job_config=job_config).result()

I also tried this format:

query_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.job.WriteDisposition('WRITE_TRUNCATE')
    ]
)

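But both attempts currently return the error:

BadRequest: 400 POST : Required parameter is missing

Can I write the data and replace the table each time?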

The load_table_from_dataframe method takes a LoadJobConfig. Here is a piece of code that works:

from google.cloud import bigquery
import pandas as pd

roc_df = pd.DataFrame([{"firstName": "Foo", "lastName": "Bar"}])

client = bigquery.Client()

dataset_ref = client.dataset('my_dataset')
table_ref = dataset_ref.table('my_table')

job_config = bigquery.job.LoadJobConfig()
job_config.write_disposition = bigquery.WriteDisposition.WRITE_TRUNCATE

client.load_table_from_dataframe(roc_df, table_ref, job_config=job_config)

The only change to your code is:

job_config = bigquery.job.LoadJobConfig()

Are you trying to write/truncate the table from the results of a query or from a load job? I think this might help you.

@GrahamPolley I'm not sure! I am using

train = client.query(training_query).to_dataframe()

to load the data, without any config set. I only need a config later to set the write rule, so maybe that is what is confusing things?