Google BigQuery: How do I get a slice of an array in BigQuery Standard SQL?
In BigQuery, I have a table with a path column like this:
ID       | Path
---------+----------------------------------------
1        | foo/bar/baz
2        | foo/bar/quux/blat
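I want to be able to split the path on the forward slash (/), select one or more of the path parts, and join them back together.
In PostgreSQL, this is easy: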
select array_to_string((regexp_split_to_array(path, '/'))[1:3], '/')
But BigQuery doesn't seem to have any kind of range-offset or array-slicing function.
The following is for BigQuery Standard SQL:
#standardSQL
SELECT id, path,
  (
    SELECT STRING_AGG(part, '/' ORDER BY index)
    FROM UNNEST(SPLIT(path, '/')) part WITH OFFSET index
    WHERE index BETWEEN 1 AND 3
  ) adjusted_path
FROM `project.dataset.table`
You can test and play with the query above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 id, 'foo/bar/baz/foo1/bar1/baz1/' path UNION ALL
  SELECT 2, 'foo/bar/quux/blat/foo2/bar2/quux2/blat2'
)
SELECT id, path,
  (
    SELECT STRING_AGG(part, '/' ORDER BY index)
    FROM UNNEST(SPLIT(path, '/')) part WITH OFFSET index
    WHERE index BETWEEN 1 AND 3
  ) adjusted_path
FROM `project.dataset.table`
Result:
Row | id | path                                    | adjusted_path
----+----+-----------------------------------------+--------------
1   | 1  | foo/bar/baz/foo1/bar1/baz1/             | bar/baz/foo1
2   | 2  | foo/bar/quux/blat/foo2/bar2/quux2/blat2 | bar/quux/blat
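Note that the result above starts at `bar` rather than `foo`. `WITH OFFSET` numbers array elements from 0, so `BETWEEN 1 AND 3` selects the second through fourth parts, whereas PostgreSQL's `[1:3]` is 1-based and selects the first three. A minimal sketch of the shifted range, assuming the goal is to match the PostgreSQL slice exactly (the alias `idx` and the column name `first_three_parts` are only illustrative):

```sql
#standardSQL
-- Offsets 0..2 are the first three parts, which is what PostgreSQL's [1:3] selects.
SELECT
  ARRAY_TO_STRING(
    ARRAY(
      SELECT part
      FROM UNNEST(SPLIT('foo/bar/quux/blat', '/')) AS part WITH OFFSET AS idx
      WHERE idx BETWEEN 0 AND 2
      ORDER BY idx
    ),
    '/'
  ) AS first_three_parts
-- first_three_parts: foo/bar/quux
```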
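If for some reason you want to keep your query inline, similar to the one you use in PostgreSQL, see the example below, which defines a temporary function named ARRAY_SLICE: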
#standardSQL
CREATE TEMP FUNCTION ARRAY_SLICE(arr ARRAY<STRING>, start INT64, finish INT64)
RETURNS ARRAY<STRING> AS (
  ARRAY(
    SELECT part FROM UNNEST(arr) part WITH OFFSET index
    WHERE index BETWEEN start AND finish ORDER BY index
  )
);
SELECT id, path,
  ARRAY_TO_STRING(ARRAY_SLICE(SPLIT(path, '/'), 1, 3), '/') adjusted_path
FROM `project.dataset.table`
Obviously, if you apply it to the same sample data, you get the same result:
#standardSQL
CREATE TEMP FUNCTION ARRAY_SLICE(arr ARRAY<STRING>, start INT64, finish INT64)
RETURNS ARRAY<STRING> AS (
  ARRAY(
    SELECT part FROM UNNEST(arr) part WITH OFFSET index
    WHERE index BETWEEN start AND finish ORDER BY index
  )
);
WITH `project.dataset.table` AS (
  SELECT 1 id, 'foo/bar/baz/foo1/bar1/baz1/' path UNION ALL
  SELECT 2, 'foo/bar/quux/blat/foo2/bar2/quux2/blat2'
)
SELECT id, path,
  ARRAY_TO_STRING(ARRAY_SLICE(SPLIT(path, '/'), 1, 3), '/') adjusted_path
FROM `project.dataset.table`
Row | id | path                                    | adjusted_path
----+----+-----------------------------------------+--------------
1   | 1  | foo/bar/baz/foo1/bar1/baz1/             | bar/baz/foo1
2   | 2  | foo/bar/quux/blat/foo2/bar2/quux2/blat2 | bar/quux/blat
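The ARRAY_SLICE function above is declared for ARRAY<STRING> only. As a rough sketch, assuming BigQuery's templated SQL UDF parameters (`ANY TYPE`) are acceptable here, the same body could serve arrays of other element types. The name `ARRAY_SLICE_ANY` and the aliases `el` and `idx` are hypothetical, and the `RETURNS` clause is left out so the return type is inferred at call time:

```sql
#standardSQL
-- Hypothetical generic variant of ARRAY_SLICE; not part of the original answer.
CREATE TEMP FUNCTION ARRAY_SLICE_ANY(arr ANY TYPE, start INT64, finish INT64) AS (
  ARRAY(
    SELECT el
    FROM UNNEST(arr) AS el WITH OFFSET AS idx
    WHERE idx BETWEEN start AND finish  -- zero-based, inclusive on both ends
    ORDER BY idx
  )
);

SELECT
  ARRAY_SLICE_ANY([10, 20, 30, 40, 50], 1, 3) AS int_slice,
  ARRAY_TO_STRING(
    ARRAY_SLICE_ANY(SPLIT('foo/bar/quux/blat', '/'), 1, 3), '/'
  ) AS path_slice
-- int_slice: [20, 30, 40]
-- path_slice: bar/quux/blat
```

As with the original function, the bounds are zero-based offsets and both ends are inclusive.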
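Perfect, thank you very much.
Glad it worked for you. Please consider accepting the answer and, if you haven't already, voting it up :o)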