Google BigQuery: Scalar subquery produces more than one element for a custom dimension

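I am trying to get a custom dimension in one branch of my UNION ALL, but BigQuery reports that a scalar subquery produced more than one element. I believe the problem is in the code below. I am migrating to standard SQL, so please answer in standard SQL.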

(
SELECT
      d.value
FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS d
WHERE
      d.index = 65) AS viewID,

Overall example of the query:

#standardSQL
SELECT
  date,
  channelGrouping,
  viewID,
  SUM(Revenue) Revenue,
  SUM(Shipping) Shipping,
  SUM(bounces) bounces,
  SUM(transactions) transactions,
  COUNT(date) sessions
FROM (
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    (
    SELECT
      d.value
    FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS d
    WHERE
      d.index = 65) AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703' )
GROUP BY
  date,
  channelGrouping,
  viewID

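The problem is that several (or all) of the hits in a session have a custom dimension with index 65, so the subquery returns more than one row. There are a few different ways to resolve this. You can use an ARRAY subquery to get all values for that index: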

ARRAY(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65) AS viewIDs,
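This gives you all view IDs across the hits, but you would then also need to use an array for viewID in the first query of the union. Another option is to take the view ID from the first hit only: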

(
  SELECT
    d.value
  FROM
    UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS d
  WHERE
    d.index = 65) AS viewID
Or, if you don't care which view ID you get, you can use LIMIT to pick an arbitrary one:

(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65
  LIMIT 1) AS viewID,
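
If an arbitrary pick feels too loose, one possible variation is to order by hit number before applying the limit. The following is only a sketch run against a mocked row: the `data` CTE and the values 'view-A' and 'view-B' are made up for illustration and are not part of the original query. Unlike the SAFE_OFFSET(0) version, which yields NULL when the first hit lacks the dimension, this returns the value from the earliest hit that actually carries index 65.

```sql
#standardSQL
-- Hypothetical mock row standing in for one ga_sessions record.
WITH data AS (
  SELECT [
    STRUCT(1 AS hitNumber,
           [STRUCT(65 AS index, 'view-A' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber,
           [STRUCT(65 AS index, 'view-B' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  (
  SELECT
    d.value
  FROM
    UNNEST(hits) AS h,
    UNNEST(h.customDimensions) AS d
  WHERE
    d.index = 65
  ORDER BY
    h.hitNumber   -- the earliest hit that carries the dimension wins
  LIMIT 1) AS viewID
FROM data
```

With the mocked row above, both hits carry index 65, and the subquery keeps the value from hitNumber 1.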

You can simulate some data in BigQuery to get a better sense of what is happening here.

For example, this data simulates the hits schema in ga_sessions:

WITH data AS(
  select ARRAY<STRUCT<hitNumber INT64, customDimensions ARRAY<STRUCT<index INT64, value STRING>> >> [STRUCT(1 as hitNumber, [STRUCT(1 as index, 'val1' as value), STRUCT(2 as index, 'val2' as value), STRUCT(3 as index, 'val3' as value)] as customDimensions), STRUCT(2 as hitNumber, [STRUCT(1 as index, 'val1' as value)] as customDimensions)] hits
)

select * from data
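The result is a single row whose hits array holds two hits, and both hits carry a custom dimension with index 1.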

So in your query you must either adapt to returning this value as an ARRAY or, if all the values for index = 65 are the same, you can do the following:

SELECT
  (select custd.value from unnest(hits) hits, unnest(hits.customDimensions) custd where index = 1 limit 1)
FROM data

This yields only one result in the scalar subquery.
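
To see both behaviours side by side, here is a small sketch against the same mocked data. The column aliases all_values and single_value are names chosen for this illustration. Index 1 appears in both hits, so it is collected with an ARRAY subquery. Index 2 appears only once, so a plain scalar subquery is safe for it.

```sql
#standardSQL
WITH data AS (
  SELECT ARRAY<STRUCT<hitNumber INT64, customDimensions ARRAY<STRUCT<index INT64, value STRING>>>>[
    STRUCT(1 AS hitNumber,
           [STRUCT(1 AS index, 'val1' AS value),
            STRUCT(2 AS index, 'val2' AS value),
            STRUCT(3 AS index, 'val3' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber,
           [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  -- index 1 occurs in both hits, so collect every match into an array
  ARRAY(
    SELECT custd.value
    FROM UNNEST(hits) AS h, UNNEST(h.customDimensions) AS custd
    WHERE custd.index = 1) AS all_values,
  -- index 2 occurs exactly once, so the scalar subquery returns one row
  (
    SELECT custd.value
    FROM UNNEST(hits) AS h, UNNEST(h.customDimensions) AS custd
    WHERE custd.index = 2) AS single_value
FROM data
```

Changing the second subquery to filter on index 1 without a LIMIT should reproduce the "more than one element" error from the question.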

Will, great explanation! Thanks for providing sample data and showing a screenshot. As a small suggestion, note that ARRAY subqueries don't need the extra parentheses.

Thanks @ElliottBrossard for the comment! I just removed the extra parentheses (I got into the habit of adding them after I started using ARRAY_AGG with subqueries).

Thanks for the explanation, it makes more sense to me now. I am still getting to grips with standard SQL, so I have a lot to learn.