Google BigQuery: scalar subquery produces more than one element for a custom dimension
I'm trying to pull a custom dimension in one branch of my UNION, but I get an error saying the scalar subquery produced more than one element. I believe the problem is in the code below. I'm migrating to Standard SQL, so please answer in Standard SQL.
(
SELECT
d.value
FROM
UNNEST(hits) AS hits,
UNNEST(hits.customDimensions) AS d
WHERE
d.index = 65) AS viewID,
The full query, for context:
#standardSQL
SELECT
date,
channelGrouping,
viewID,
SUM(Revenue) Revenue,
SUM(Shipping) Shipping,
SUM(bounces) bounces,
SUM(transactions) transactions,
COUNT(date) sessions
FROM (
SELECT
date,
channelGrouping,
'XXXXXXXXX' AS viewID,
totals.totaltransactionrevenue / 1e6 Revenue,
(
SELECT
SUM(hits.transaction.transactionshipping) / 1e6
FROM
UNNEST(hits) hits) Shipping,
totals.bounces bounces,
totals.transactions transactions
FROM
`XXXXXXXXX.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170625'
AND '20170703'
UNION ALL
SELECT
date,
channelGrouping,
'XXXXXXXXX' AS viewID,
totals.totaltransactionrevenue / 1e6 Revenue,
(
SELECT
SUM(hits.transaction.transactionshipping) / 1e6
FROM
UNNEST(hits) hits) Shipping,
totals.bounces bounces,
totals.transactions transactions
FROM
`XXXXXXXXX.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170625'
AND '20170703'
UNION ALL
SELECT
date,
channelGrouping,
(
SELECT
d.value
FROM
UNNEST(hits) AS hits,
UNNEST(hits.customDimensions) AS d
WHERE
d.index = 65) AS viewID,
totals.totaltransactionrevenue / 1e6 Revenue,
(
SELECT
SUM(hits.transaction.transactionshipping) / 1e6
FROM
UNNEST(hits) hits) Shipping,
totals.bounces bounces,
totals.transactions transactions
FROM
`XXXXXXXXX.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170625'
AND '20170703'
UNION ALL
SELECT
date,
channelGrouping,
'XXXXXXXXX' AS viewID,
totals.totaltransactionrevenue / 1e6 Revenue,
(
SELECT
SUM(hits.transaction.transactionshipping) / 1e6
FROM
UNNEST(hits) hits) Shipping,
totals.bounces bounces,
totals.transactions transactions
FROM
`XXXXXXXXX.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170625'
AND '20170703'
UNION ALL
SELECT
date,
channelGrouping,
'XXXXXXXXX' AS viewID,
totals.totaltransactionrevenue / 1e6 Revenue,
(
SELECT
SUM(hits.transaction.transactionshipping) / 1e6
FROM
UNNEST(hits) hits) Shipping,
totals.bounces bounces,
totals.transactions transactions
FROM
`XXXXXXXXX.ga_sessions_*`
WHERE
_TABLE_SUFFIX BETWEEN '20170625'
AND '20170703' )
GROUP BY
date,
channelGrouping,
viewID
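The problem is that some or all of the hits have a custom dimension with index 65, so the subquery returns one row per matching hit instead of a single value. There are a few different ways to resolve this. You can use an ARRAY subquery to get all of the values for that index: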
ARRAY(
SELECT
d.value
FROM
UNNEST(hits) AS hits,
UNNEST(hits.customDimensions) AS d
WHERE
d.index = 65) AS viewIDs,
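This gives you all of the view IDs across hits, but you will then also need to use an array of view IDs in the first query of the union. Another option is to take the view ID from the first hit only: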
(
SELECT
d.value
FROM
UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS d
WHERE
d.index = 65) AS viewID
Or, if you don't care which view ID you get, you can use LIMIT to pick an arbitrary one:
(
SELECT
d.value
FROM
UNNEST(hits) AS hits,
UNNEST(hits.customDimensions) AS d
WHERE
d.index = 65
LIMIT 1) AS viewID,
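To illustrate the point about keeping the union branches consistent when you go with the ARRAY option, here is a minimal sketch. The `sessions` CTE and its values (`'view-a'`) are made-up mock data standing in for the real `ga_sessions_*` tables; only the shape of the two branches matters.

#standardSQL
-- Hypothetical mock data: one session whose two hits both carry index 65.
WITH sessions AS (
  SELECT
    '20170625' AS date,
    [STRUCT(1 AS hitNumber,
            [STRUCT(65 AS index, 'view-a' AS value)] AS customDimensions),
     STRUCT(2 AS hitNumber,
            [STRUCT(65 AS index, 'view-a' AS value)] AS customDimensions)] AS hits
)
-- Branch with a hard-coded view ID: wrap the literal in an array
-- so the column type matches the other branch.
SELECT
  date,
  ['XXXXXXXXX'] AS viewIDs
FROM sessions
UNION ALL
-- Branch that reads the view IDs from the custom dimension.
SELECT
  date,
  ARRAY(
    SELECT d.value
    FROM UNNEST(hits) AS hits, UNNEST(hits.customDimensions) AS d
    WHERE d.index = 65) AS viewIDs
FROM sessions

Keep in mind that the outer query in the question groups by viewID. Grouping on an array column may not be accepted, so with this option you would probably have to turn the array back into a single value first, for example with ARRAY_TO_STRING(viewIDs, ','). The first-hit and LIMIT 1 options avoid that extra step.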
You can mock some data in BigQuery to get a better understanding of what is happening here. For example, this data simulates the hits schema of ga_sessions:
WITH data AS(
select ARRAY<STRUCT<hitNumber INT64, customDimensions ARRAY<STRUCT<index INT64, value STRING>> >> [STRUCT(1 as hitNumber, [STRUCT(1 as index, 'val1' as value), STRUCT(2 as index, 'val2' as value), STRUCT(3 as index, 'val3' as value)] as customDimensions), STRUCT(2 as hitNumber, [STRUCT(1 as index, 'val1' as value)] as customDimensions)] hits
)
select * from data
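The result is a single row whose hits array contains two hits, and both hits carry a custom dimension with index 1. So in your query you either have to adapt to return this value as an array or, if all of the values for index = 65 are the same, you can do something like the following (shown here with index = 1 against the mock data):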
SELECT
(select custd.value from unnest(hits) hits, unnest(hits.customDimensions) custd where index = 1 limit 1)
FROM data
This yields only one result in the scalar subquery.
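As a rough side-by-side of the approaches discussed above, the following sketch runs all three against the same kind of mock data. The column aliases (all_values, first_hit_value, arbitrary_value) and the range variable h are names chosen for this illustration only, and the mock rows are a trimmed version of the sample data, not real Analytics exports.

#standardSQL
-- Mock data: two hits, both carrying a custom dimension with index 1.
WITH data AS (
  SELECT
    [STRUCT(1 AS hitNumber,
            [STRUCT(1 AS index, 'val1' AS value),
             STRUCT(2 AS index, 'val2' AS value)] AS customDimensions),
     STRUCT(2 AS hitNumber,
            [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)] AS hits
)
SELECT
  -- Option 1: collect every value for index 1 across all hits.
  ARRAY(
    SELECT d.value
    FROM UNNEST(hits) AS h, UNNEST(h.customDimensions) AS d
    WHERE d.index = 1) AS all_values,
  -- Option 2: read the value from the first hit only.
  (SELECT d.value
   FROM UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS d
   WHERE d.index = 1) AS first_hit_value,
  -- Option 3: take an arbitrary matching value.
  (SELECT d.value
   FROM UNNEST(hits) AS h, UNNEST(h.customDimensions) AS d
   WHERE d.index = 1
   LIMIT 1) AS arbitrary_value
FROM data

With this mock data, all_values should hold 'val1' twice while the other two columns each return a single 'val1'. Removing the LIMIT 1 from the third subquery should bring back the "more than one element" error from the question.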
Nicely explained, Will! Thanks for providing the sample data and showing a screenshot. As a small suggestion, note that ARRAY subqueries don't need the extra parentheses.
Thanks for the comment, @ElliottBrossard! I just removed the extra parentheses (I got into the habit of adding them after I started using ARRAY_AGG with subqueries).
Thanks for the explanation, it makes more sense to me now. I'm still getting the hang of Standard SQL, so I have a lot to learn.