Google BigQuery: Can a BigQuery procedure be called from the Python client?


BigQuery scripting and procedures were just released in beta. Is it possible to call a procedure using the BigQuery Python client?

I tried:

query = """CALL `myproject.dataset.procedure`()...."""
job = client.query(query, location="US",)
print(job.results())
print(job.ddl_operation_performed)

print(job._properties)

But that didn't give me the result set from the procedure. Is it possible to get the results?
Thanks, everyone!

Edit: the stored procedure I am calling


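This works if your procedure contains a SELECT statement. Given the procedure:

CREATE OR REPLACE PROCEDURE dataset.proc_output()
BEGIN
  SELECT t FROM UNNEST(['1', '2', '3']) t;
END;

Code:

from google.cloud import bigquery

client = bigquery.Client()
query = """CALL dataset.proc_output()"""
job = client.query(query, location="US")
for result in job.result():
    print(result)

This will output: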

Row((u'1',), {u't': 0})
Row((u'2',), {u't': 0})
Row((u'3',), {u't': 0})
However, if a procedure contains multiple SELECT statements, only the last result set can be fetched this way.

Update

See the following example:

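CREATE OR REPLACE PROCEDURE zyun.exists(IN country STRING, IN accessDate DATE, OUT saleExists INT64)
BEGIN
  SET saleExists = (
    WITH data AS (SELECT "US" purchaseCountry, DATE "2019-1-1" purchaseDate)
    SELECT COUNT(*) FROM data WHERE purchaseCountry = country AND purchaseDate = accessDate);
  IF saleExists = 0 THEN
    INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId)
    VALUES (country, accessDate, accessId);
  END IF;
END;

BEGIN
  DECLARE saleExists INT64;
  CALL zyun.exists("US", DATE "2019-2-1", saleExists);
  SELECT saleExists;
END;

By the way, your example would be better written as a single statement rather than a script.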

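If you follow the CALL command with a SELECT statement, you can get the procedure's output value as a result set. For example, I created the following stored procedure: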

BEGIN
  -- Build an array of the top 100 names from the year 2017.
DECLARE
  top_names ARRAY<STRING>;
SET
  top_names = (
  SELECT
    ARRAY_AGG(name
    ORDER BY
      number DESC
    LIMIT
      100)
  FROM
    `bigquery-public-data.usa_names.usa_1910_current`
  WHERE
    year = 2017 );
  -- Which names appear as words in Shakespeare's plays?
SET
  top_shakespeare_names = (
  SELECT
    ARRAY_AGG(name)
  FROM
    UNNEST(top_names) AS name
  WHERE
    name IN (
    SELECT
      word
    FROM
      `bigquery-public-data.samples.shakespeare` ));
END
Running the following query returns the procedure's output as the top-level result set:

DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `my-project.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
In Python:

from google.cloud import bigquery

client = bigquery.Client()
query_string = """
DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `swast-scratch.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
"""
query_job = client.query(query_string)
rows = list(query_job.result())
print(rows)
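The DECLARE / CALL / SELECT pattern above can be wrapped in a small helper. The following is only a sketch under stated assumptions: the function names `build_out_param_script` and `call_with_out_param` are hypothetical, the procedure is assumed to take exactly one OUT parameter, and `client` is assumed to behave like `bigquery.Client` (a `query()` method returning a job with `result()`). The value is read by position rather than by column name to avoid relying on how the final SELECT names its column.

```python
def build_out_param_script(procedure, out_name, out_type):
    """Build a script that calls `procedure` and selects its single OUT parameter.

    Hypothetical helper: the inputs are interpolated as-is, so they must come
    from trusted code, not from user input.
    """
    return (
        f"DECLARE {out_name} {out_type} DEFAULT NULL;\n"
        f"CALL `{procedure}`({out_name});\n"
        f"SELECT {out_name};"
    )


def call_with_out_param(client, procedure, out_name, out_type):
    """Run the script and return the OUT value from the final SELECT."""
    script = build_out_param_script(procedure, out_name, out_type)
    rows = list(client.query(script).result())
    # The final SELECT is expected to yield one row with one column.
    return rows[0][0]


# Possible usage (requires google-cloud-bigquery and credentials):
# from google.cloud import bigquery
# names = call_with_out_param(
#     bigquery.Client(),
#     "my-project.test_dataset.top_names",
#     "top_shakespeare_names",
#     "ARRAY<STRING>",
# )
```

Keeping the script builder separate from the call makes it easy to inspect the generated SQL before running it.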
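Related: if the stored procedure contains SELECT statements, you can iterate over the child jobs to get their results, even when a SELECT statement is not the last statement in the procedure.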

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

# Run a SQL script.
sql_script = """
-- Declare a variable to hold names as an array.
DECLARE top_names ARRAY<STRING>;

-- Build an array of the top 100 names from the year 2000.
SET top_names = (
SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE year = 2000
);

-- Which names appear as words in Shakespeare's plays?
SELECT
name AS shakespeare_name
FROM UNNEST(top_names) AS name
WHERE name IN (
SELECT word
FROM `bigquery-public-data.samples.shakespeare`
);
"""
parent_job = client.query(sql_script)

# Wait for the whole script to finish.
rows_iterable = parent_job.result()
print("Script created {} child jobs.".format(parent_job.num_child_jobs))

# Fetch result rows for the final sub-job in the script.
rows = list(rows_iterable)
print("{} of the top 100 names from year 2000 also appear in Shakespeare's works.".format(len(rows)))

# Fetch jobs created by the SQL script.
child_jobs_iterable = client.list_jobs(parent_job=parent_job)
for child_job in child_jobs_iterable:
    child_rows = list(child_job.result())
    print("Child job with ID {} produced {} rows.".format(child_job.job_id, len(child_rows)))

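My procedure has a SELECT statement in it, but it also has an INSERT statement. Does that not work? I don't even get a result. There is an IF-THEN statement that looks at the result of the SELECT statement and, if the result is false, inserts and then returns the SELECT.

@WIT, if your expected output can be expressed as an array or an array of structs, I recommend using an OUT parameter so that the output is part of the procedure's interface.

The output isn't the problem. The problem is that, because of scripting, the call returns 2 or 3 result sets, and I cannot access that output from the BigQuery client.

Which statement's output do you want to capture? SELECT 1 FROM dataset.table WHERE purchaseCountry = country AND purchaseDate = accessDate AND customerId = accessId? Why doesn't the current procedure work for you?

Or just saleExists as 1/0, which is basically the same as SELECT 1 FROM dataset.table.

Updated my answer, which also simplifies your procedure body.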