SQL: Get the last matching value in a record type by value in BigQuery
I have a data structure in BigQuery that looks like this:
[{
sessionID: '123456',
revenue: 100.00,
pagesViewed: [
{hit: 1, val: "a.html"}, {hit:3, val: "b.html"}, {hit:3, val: "c.html?test=AAC"}, {hit:10, val:"d.html?test=CCC"}
]
},
{
sessionID: '5555',
revenue: 50.00,
pagesViewed: [
{hit: 1, val: "a.html"}, {hit:3, val: "b.html?test=123"}, {hit:9, val: "c.html"}, {hit:14, val:"d.html"}
]
}]
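I am trying to get the last test ID for each session. For session A (sessionID 123456), the last test ID should be CCC. For session B (sessionID 5555), it should be 123. From there, I want the sum of revenue grouped by that final test value.
The query I tried is: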
SELECT
REGEXP_EXTRACT(mnt,r'\?test\=([^&]*)') as TestId,
SUM(rev) as Revenue
FROM (
SELECT
sessionID,
MAX(CONCAT(CAST(pagesViewed.hit AS string),pagesViewed.val)) AS mnt,
MAX(revenue) AS rev
FROM
`table` AS m,
UNNEST(m.pagesViewed) AS pagesViewed
WHERE
pagesViewed.val LIKE "%test=%"
GROUP BY
1
ORDER BY
1,
2 ASC)
GROUP BY
1
ORDER BY
2 DESC
However, the output does not match the expected values described above. Any help would be appreciated.
Output:
Row TestId Revenue
1 AAC 100.0
2 123 50.0
Expected:
Row TestId Revenue
1 CCC 100.0
2 123 50.0
This should work for your purposes:
SELECT
(SELECT
ARRAY_AGG(
REGEXP_EXTRACT(pageViewed.val,r'\?test\=([^&]*)')
IGNORE NULLS ORDER BY pageViewed.hit DESC LIMIT 1)[OFFSET(0)]
FROM UNNEST(pagesViewed) AS pageViewed
) AS TestId,
SUM(revenue) AS Revenue
FROM `project.dataset.table`
GROUP BY 1
ORDER BY 2 DESC;
It returns the last matching "test" value in the array, ordered by hit number. You can try it on sample data:
WITH `project.dataset.table` AS (
SELECT '123456' AS sessionId, 100.00 AS revenue, ARRAY<STRUCT<hit INT64, val STRING>>[(1, 'a.html'), (2, 'b.html'), (3, 'c.html?test=AAC'), (4, 'd.html?test=CCC')] AS pagesViewed UNION ALL
SELECT '5555', 50.00, ARRAY<STRUCT<hit INT64, val STRING>>[(1, 'a.html'), (2, 'b.html?test=123'), (3, 'c.html'), (4, 'd.html')]
)
SELECT
(SELECT
ARRAY_AGG(
REGEXP_EXTRACT(pageViewed.val,r'\?test\=([^&]*)')
IGNORE NULLS ORDER BY pageViewed.hit DESC LIMIT 1)[OFFSET(0)]
FROM UNNEST(pagesViewed) AS pageViewed
) AS TestId,
SUM(revenue) AS Revenue
FROM `project.dataset.table`
GROUP BY 1
ORDER BY 2 DESC;
This returns CCC with 100.0 in one row and 123 with 50.0 in the other.
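A likely reason the query in the question returned AAC is that it takes MAX over a string built from the hit number and the URL, so the comparison is lexicographic rather than numeric: '3c.html?test=AAC' sorts after '10d.html?test=CCC' because '3' is greater than '1'. The sample data in this answer uses hits 1 through 4, so it does not exercise that case. The sketch below isolates the difference using only the two matching rows from session 123456. The inline array and the column names max_by_string and last_by_hit are illustrative, not part of the original table.

```sql
-- Minimal sketch: string MAX vs. numeric ordering of the hit number.
-- The two rows are the "test" pages of session 123456 from the question.
SELECT
  -- Lexicographic comparison: '3...' beats '10...', so this picks the AAC page.
  MAX(CONCAT(CAST(hit AS STRING), val)) AS max_by_string,              -- '3c.html?test=AAC'
  -- Numeric ordering on hit: 10 > 3, so this picks the CCC page.
  ARRAY_AGG(val ORDER BY hit DESC LIMIT 1)[OFFSET(0)] AS last_by_hit   -- 'd.html?test=CCC'
FROM UNNEST([
  STRUCT(3 AS hit, 'c.html?test=AAC' AS val),
  STRUCT(10 AS hit, 'd.html?test=CCC' AS val)
]);
```

Ordering by the integer hit, as the ARRAY_AGG form above does, avoids the problem without having to zero-pad the hit number.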