Specifying the region when reading a BigQuery dataset in Dataflow (Python SDK)
I am trying to read a BigQuery dataset in Dataflow, but the job cannot find the BigQuery dataset/table I specified.

The job name is preprocess-ga360-190523-130005.

My Datalab VM, GCS bucket, and BigQuery dataset are all located in europe-west2, but for some reason the job is searching for the dataset in location "US".

Module versions are apache-beam 2.5.0, google-cloud-dataflow 2.0.0, google-cloud-bigquery 0.25.0.

I searched the documentation but could not find an answer as to why this happens.
import apache_beam as beam

OUTPUT_DIR = "gs://some-bucket/some-folder/"

# dictionary of pipeline options
options = {
    "staging_location": "gs://some-bucket/some-folder/stage/",
    "temp_location": "gs://some-bucket/some-folder/tmp/",
    "job_name": job_name,
    "project": PROJECT,
    "runner": "DirectRunner",
    "location": "europe-west2",
    "region": "europe-west2",
}

# instantiate PipelineOptions object using options dictionary
opts = beam.pipeline.PipelineOptions(flags=[], **options)

# instantiate Pipeline object using PipelineOptions
with beam.Pipeline(options=opts) as p:
    outfile = "gs://some-bucket/some-folder/train.csv"
    (
        p
        | "read_train" >> beam.io.Read(
            beam.io.BigQuerySource(query=my_query, use_standard_sql=True))
        | "tocsv_train" >> beam.Map(to_csv)
        | "write_train" >> beam.io.Write(beam.io.WriteToText(outfile))
    )
print("Done")
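As an aside, PipelineOptions also accepts the same settings as command-line-style flag strings, which sidesteps the missing-comma pitfall of a literal dict. A minimal stdlib-only sketch; the `options_to_flags` helper and the placeholder values are mine, written for illustration, not part of the Beam API:

```python
def options_to_flags(options):
    """Convert an options dict into the --key=value flag strings that
    PipelineOptions(flags=...) can parse (hypothetical helper)."""
    return ["--{}={}".format(key, value) for key, value in options.items()]

# Placeholder values for illustration only.
flags = options_to_flags({
    "project": "my-project",
    "region": "europe-west2",
    "temp_location": "gs://some-bucket/some-folder/tmp/",
})
print(flags)
```

You would then pass the resulting list as `PipelineOptions(flags=flags)` instead of unpacking keyword arguments.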
The response:

HttpError: HttpError accessing :

Answer:
This does not appear to be in the Apache Beam 2.5.0 Python SDK. It looks like support was added in the Apache Beam 2.8.0 Python SDK [,].
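Since the answer points at 2.8.0, a quick sanity check before debugging further is to compare the installed SDK version against that minimum. A stdlib-only sketch; the `version_at_least` helper is hypothetical and assumes plain "X.Y.Z" version strings:

```python
def version_at_least(installed, minimum):
    """Return True if a dotted version string meets the minimum.
    Naive numeric comparison; assumes plain 'X.Y.Z' strings."""
    to_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return to_tuple(installed) >= to_tuple(minimum)

# Per the answer above, support landed in 2.8.0,
# so the asker's 2.5.0 install is too old:
print(version_at_least("2.5.0", "2.8.0"))  # False
print(version_at_least("2.8.0", "2.8.0"))  # True
```

In practice you would feed in `apache_beam.__version__` and upgrade the SDK if the check fails.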