Google BigQuery: Is it possible to call a BigQuery procedure from the Python client?
BigQuery scripting and procedures were just released in beta. Is it possible to call a procedure using the BigQuery Python client? I tried:
query = """CALL `myproject.dataset.procedure`()...."""
job = client.query(query, location="US",)
print(job.results())
print(job.ddl_operation_performed)
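print(job._properties)
but that didn't give me the result set from the procedure. Is it possible to get the results?
Thanks, everyone!
Edited: the stored procedure I am calling
This works if your procedure contains a SELECT statement. Given the following procedure:
CREATE OR REPLACE PROCEDURE dataset.proc_output()
BEGIN
  SELECT t FROM UNNEST(['1', '2', '3']) t;
END;
Code:
from google.cloud import bigquery
client = bigquery.Client()
query = """CALL dataset.proc_output()"""
job = client.query(query, location="US")
for result in job.result():
    print(result)
Will output: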
Row((u'1',), {u't': 0})
Row((u'2',), {u't': 0})
Row((u'3',), {u't': 0})
However, if a procedure contains multiple SELECT statements, only the last result set can be fetched this way.
Update
See the example below:
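CREATE OR REPLACE PROCEDURE zyun.exists(IN country STRING, IN accessDate DATE, OUT saleExists INT64)
BEGIN
  SET saleExists = (
    WITH data AS (SELECT "US" purchaseCountry, DATE "2019-1-1" purchaseDate)
    SELECT COUNT(*) FROM data WHERE purchaseCountry = country AND purchaseDate = accessDate
  );
  IF saleExists = 0 THEN
    INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId) VALUES (country, accessDate, accessId);
  END IF;
END;
BEGIN
  DECLARE saleExists INT64;
  CALL zyun.exists("US", DATE "2019-2-1", saleExists);
  SELECT saleExists;
END;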
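By the way, your example would be better written as a single statement than as a script.
If you follow the CALL command with a SELECT statement, you can get the procedure's return value as a result set. For example, I created the following stored procedure: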
BEGIN
  -- Build an array of the top 100 names from the year 2017.
  DECLARE top_names ARRAY<STRING>;
  SET top_names = (
    SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
    FROM `bigquery-public-data.usa_names.usa_1910_current`
    WHERE year = 2017
  );
  -- Which names appear as words in Shakespeare's plays?
  SET top_shakespeare_names = (
    SELECT ARRAY_AGG(name)
    FROM UNNEST(top_names) AS name
    WHERE name IN (
      SELECT word
      FROM `bigquery-public-data.samples.shakespeare`
    )
  );
END
Running the following query returns the procedure's output as the top-level result set:
DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `my-project.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
In Python:
from google.cloud import bigquery
client = bigquery.Client()
query_string = """
DECLARE top_shakespeare_names ARRAY<STRING> DEFAULT NULL;
CALL `swast-scratch.test_dataset.top_names`(top_shakespeare_names);
SELECT top_shakespeare_names;
"""
query_job = client.query(query_string)
rows = list(query_job.result())
print(rows)
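The DECLARE / CALL / SELECT pattern above can be wrapped in a small helper. This is only a minimal sketch: `build_call_script` and `call_procedure` are hypothetical names, not part of the client library, and the helper assumes a procedure with exactly one OUT parameter. It uses plain string formatting, so it should only be given trusted identifiers.

```python
def build_call_script(procedure, out_name, out_type):
    """Return a script that calls `procedure` and then selects its OUT parameter.

    Hypothetical helper: assumes the procedure takes a single OUT parameter.
    """
    return (
        "DECLARE {name} {type} DEFAULT NULL;\n"
        "CALL `{proc}`({name});\n"
        "SELECT {name};"
    ).format(name=out_name, type=out_type, proc=procedure)


def call_procedure(client, procedure, out_name, out_type):
    """Run the script and return the rows of its final SELECT as a list.

    `client` is expected to behave like google.cloud.bigquery.Client:
    client.query(sql) returns a job whose result() yields rows.
    """
    script = build_call_script(procedure, out_name, out_type)
    query_job = client.query(script)  # submit the whole script as one query job
    return list(query_job.result())   # rows produced by the last statement
```

With a real client this would be called as `call_procedure(bigquery.Client(), "my-project.test_dataset.top_names", "top_shakespeare_names", "ARRAY<STRING>")`, which submits the same three statements shown above.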
Related: if the stored procedure contains SELECT statements, you can iterate over the child jobs to fetch their results, even when a SELECT statement is not the last statement in the procedure:
from google.cloud import bigquery
# Construct a BigQuery client object.
client = bigquery.Client()
# Run a SQL script.
sql_script = """
-- Declare a variable to hold names as an array.
DECLARE top_names ARRAY<STRING>;
-- Build an array of the top 100 names from the year 2000.
SET top_names = (
SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
FROM `bigquery-public-data.usa_names.usa_1910_2013`
WHERE year = 2000
);
-- Which names appear as words in Shakespeare's plays?
SELECT
name AS shakespeare_name
FROM UNNEST(top_names) AS name
WHERE name IN (
SELECT word
FROM `bigquery-public-data.samples.shakespeare`
);
"""
parent_job = client.query(sql_script)
# Wait for the whole script to finish.
rows_iterable = parent_job.result()
print("Script created {} child jobs.".format(parent_job.num_child_jobs))
# Fetch result rows for the final sub-job in the script.
rows = list(rows_iterable)
print("{} of the top 100 names from year 2000 also appear in Shakespeare's works.".format(len(rows)))
# Fetch jobs created by the SQL script.
child_jobs_iterable = client.list_jobs(parent_job=parent_job)
for child_job in child_jobs_iterable:
    child_rows = list(child_job.result())
    print("Child job with ID {} produced {} rows.".format(child_job.job_id, len(child_rows)))
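When only the SELECT result sets matter, the child-job loop above could be narrowed down along these lines. This is a sketch under assumptions: `select_results_by_child_job` is a hypothetical helper, it relies on the client library's `statement_type` and `created` job properties, and it reloads each child job first on the assumption that listed jobs may not carry full statistics. Which statements BigQuery reports as `SELECT` is worth checking against your own script.

```python
def select_results_by_child_job(client, parent_job):
    """Return [(child_job_id, rows), ...] for SELECT child jobs, oldest first.

    `client` is expected to behave like google.cloud.bigquery.Client and
    `parent_job` like the QueryJob returned by client.query(script).
    """
    parent_job.result()  # wait for the whole script to finish

    child_jobs = list(client.list_jobs(parent_job=parent_job))
    # Order by creation time so results follow the script's execution order.
    child_jobs.sort(key=lambda job: job.created)

    results = []
    for child_job in child_jobs:
        child_job.reload()  # assumption: refresh so statement_type is populated
        if child_job.statement_type != "SELECT":
            continue  # skip INSERT and other non-SELECT statements
        results.append((child_job.job_id, list(child_job.result())))
    return results
```

Each entry pairs a child job ID with the rows of that statement, so a procedure with several SELECT statements yields one entry per SELECT rather than only the last result set.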
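My procedure has a SELECT statement, but it also has an INSERT statement. Does that not work? I don't get any results at all. There is an IF-THEN statement that looks at the result of the SELECT, inserts if that result is false, and then returns the SELECT.
@WIT, if your expected output can be expressed as an array or an array of structs, I recommend using an OUT parameter to make the output part of the procedure's interface.
The output isn't the problem. The problem is that, because of scripting, it returns 2 or 3 result sets, and I cannot get at those results through the BigQuery client.
Which statement's output do you want to capture? SELECT 1 FROM dataset.table WHERE purchaseCountry = country AND purchaseDate = accessDate AND customerId = accessId? Why doesn't the current procedure work for you?
Or just saleExists 1/0, which is basically the same as SELECT 1 FROM dataset.table.
Updated my answer, which also simplifies your procedure body.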