Google BigQuery: split on a delimiter only once
Is there a way to split on a delimiter only once? My data may contain the delimiter at several positions, and I would like to break one field into two separate fields. For example, with a period as the delimiter, I would like the string how.now.brown.cow to be split into two fields: [how, now.brown.cow].
SPLIT({field}, 'delimiter')[SAFE_OFFSET(0)] works well for getting the first part, but the resulting arrays can have unequal lengths in my data, so I run into problems when joining the remaining indices.
Below is a solution for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'how.now.brown.cow' col UNION ALL
  SELECT 'how'
)
SELECT col,
  SPLIT(col, '.')[OFFSET(0)] AS first_item,
  ( SELECT STRING_AGG(item, '.' ORDER BY OFFSET)
    FROM UNNEST(SPLIT(col, '.')) item WITH OFFSET
    WHERE OFFSET > 0
  ) AS rest_of_items
FROM `project.dataset.table`
with output
Row  col                first_item  rest_of_items
1    how.now.brown.cow  how         now.brown.cow
2    how                how         null
Note: the above is just one approach. There are many ways to achieve the same result, for example:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'how.now.brown.cow' col UNION ALL
  SELECT 'how'
)
SELECT col,
  REGEXP_EXTRACT(col, r'^([^.]*)\.?') AS first_item,
  REGEXP_EXTRACT(col, r'^[^.]*\.?(.*)$') AS rest_of_items
FROM `project.dataset.table`
with output (here rest_of_items for the second row is an empty string rather than null)
Row  col                first_item  rest_of_items
1    how.now.brown.cow  how         now.brown.cow
2    how                how
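A third possibility, offered here only as a sketch and not part of the original answer, is to locate the first delimiter with STRPOS and cut the string around it with SUBSTR. It reuses the same sample CTE; returning NULL for rest_of_items when no delimiter is present is a choice made for this sketch to match the first query.

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'how.now.brown.cow' col UNION ALL
  SELECT 'how'
)
SELECT col,
  -- STRPOS is 1-based and returns 0 when the delimiter is absent
  IF(STRPOS(col, '.') = 0,
     col,
     SUBSTR(col, 1, STRPOS(col, '.') - 1)) AS first_item,
  -- everything after the first delimiter, or NULL if there is none
  IF(STRPOS(col, '.') = 0,
     NULL,
     SUBSTR(col, STRPOS(col, '.') + 1)) AS rest_of_items
FROM `project.dataset.table`
```

For the two sample rows this should give how / now.brown.cow and how / null. With a delimiter longer than one character, the `+ 1` would need to become the delimiter's length.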