Splitting key-value pairs into columns in Google BigQuery
I'm new to Google BigQuery and definitely struggling. My table contains the following:
+----------+----------------------------------------+
| order_id | line_items |
+----------+----------------------------------------+
| 123 | id:1|qy:1|sum:1.00;id:2|qy:6|sum:4.50; |
+----------+----------------------------------------+
| 456 | id:1|qy:3|sum:3.00;id:3|qy:4|sum:3.20; |
+----------+----------------------------------------+
I need it to look like this:
+----------+----+----+------+
| order_id | id | qy | sum |
+----------+----+----+------+
| 123 | 1 | 1 | 1.00 |
| 123 | 2 | 6 | 4.50 |
| 456 | 1 | 3 | 3.00 |
| 456 | 3 | 4 | 3.20 |
+----------+----+----+------+
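The number of key-value pairs in line_items is arbitrary (there are many more than these three, but these three are the ones I need to extract).
I was able to get a query using UNNEST and SPLIT to work, but unfortunately I am still left with the raw key-value pairs.
It gets me here: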
+----------+------------+
| order_id | line_items |
+----------+------------+
| 123 | id:1 |
| 123 | qy:1 |
| 123 | sum:1.00 |
| 123 | id:2 |
| 123 | qy:6 |
| 123 | sum:4.50; |
| 456 | id:1 |
| 456 | qy:3 |
| 456 | sum:3.00 |
| 456 | id:3 |
| 456 | qy:4 |
| 456 | sum:3.20 |
+----------+------------+
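So I still can't work out how to turn the keys into column headers and the values into the column contents.
I'd be very grateful if someone could point me in the right direction.
Thanks a lot in advance.
Below is a solution for BigQuery Standard SQL: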
#standardSQL
-- items: one line item per row, e.g. 'id:1|qy:1|sum:1.00'
-- x.kvs: that line item's 'key:value' strings as an array
select order_id,
  (select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'id') as id,
  (select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'qy') as qy,
  (select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'sum') as sum
from `project.dataset.table`,
  unnest(split(trim(line_items, ';'), ';')) items,
  unnest([struct(split(items, '|') as kvs)]) x
-- order by order_id
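When applied to the sample data from the question, this produces the desired result shown there: one row per line item, with id, qy, and sum as columns (all returned as strings).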
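To try the query without a real table, the two sample rows from the question can be inlined with a WITH clause. This is only a sketch: the CTE name sample_orders is a hypothetical stand-in for `project.dataset.table`, and the ORDER BY is added just to make the row order predictable.

```sql
#standardSQL
-- Hypothetical inline copy of the sample data from the question.
with sample_orders as (
  select 123 as order_id, 'id:1|qy:1|sum:1.00;id:2|qy:6|sum:4.50;' as line_items union all
  select 456, 'id:1|qy:3|sum:3.00;id:3|qy:4|sum:3.20;'
)
select order_id,
  -- Each scalar subquery scans the pairs of one line item and keeps the value of one key.
  (select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'id') as id,
  (select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'qy') as qy,
  (select split(kv, ':')[offset(1)] from x.kvs kv where split(kv, ':')[offset(0)] = 'sum') as sum
from sample_orders,
  -- Drop the trailing ';' first, otherwise SPLIT would yield an empty last item.
  unnest(split(trim(line_items, ';'), ';')) items,
  -- Wrap the pairs of one line item in a single-element array of structs,
  -- so each line item stays one output row.
  unnest([struct(split(items, '|') as kvs)]) x
order by order_id, id
```

Keys that are present in line_items but not selected are simply ignored, which matches the requirement of extracting only three of the many pairs.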
The following variation of the above may also be useful:
#standardSQL
-- Same idea, but each 'key:value' string is split only once,
-- into an array z.y of (key, value) structs.
select order_id,
  (select value from z.y where key = 'id') as id,
  (select value from z.y where key = 'qy') as qy,
  (select value from z.y where key = 'sum') as sum
from `project.dataset.table`,
  unnest(split(trim(line_items, ';'), ';')) items,
  unnest([struct(split(items, '|') as kvs)]) x,
  unnest([struct(array(
    select as struct
      split(kv, ':')[offset(0)] as key,
      split(kv, ':')[offset(1)] as value
    from x.kvs kv
  ) as y)]) z
-- order by order_id
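Both queries return every value as a STRING, and [offset(1)] raises an error when a pair contains no ':'. If typed columns and more tolerant parsing are wanted, one possible sketch is shown here. It is not part of the original answer: the INT64 and NUMERIC target types are assumptions about the data, and sample_orders is a hypothetical inline copy of the question's sample rows.

```sql
#standardSQL
-- Hypothetical inline copy of the sample data from the question.
with sample_orders as (
  select 123 as order_id, 'id:1|qy:1|sum:1.00;id:2|qy:6|sum:4.50;' as line_items union all
  select 456, 'id:1|qy:3|sum:3.00;id:3|qy:4|sum:3.20;'
)
select order_id,
  -- SAFE_CAST returns NULL instead of failing when a value is not a valid number.
  safe_cast((select value from z.y where key = 'id') as int64) as id,
  safe_cast((select value from z.y where key = 'qy') as int64) as qy,
  safe_cast((select value from z.y where key = 'sum') as numeric) as sum
from sample_orders,
  unnest(split(trim(line_items, ';'), ';')) items,
  unnest([struct(array(
    select as struct
      -- SAFE_OFFSET returns NULL instead of failing when the pair has no ':'.
      split(kv, ':')[safe_offset(0)] as key,
      split(kv, ':')[safe_offset(1)] as value
    from unnest(split(items, '|')) kv
  ) as y)]) z
order by order_id, id
```

Note that a scalar subquery still fails if the same key appears more than once within a single line item, so this sketch assumes keys are unique per item.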
Amazing! This works perfectly. Thank you very much! I'm now going to try to understand what you are doing there :D. Thanks again.