BigQuery: optimizing a query that filters a nested array of STRUCT fields and groups the results back


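I'm trying to figure out how to write a GQL (Google SQL) query that filters a deeply nested structure, nests it again afterwards, and keeps the first record of a STRUCT attribute at the same level as the array.

I prepared a sample schema: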

 WITH
      Sale AS (
      SELECT
        "1" AS _id,
        STRUCT("11" AS _id,
          "SERVICE" AS feedbackType,
          DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS serviceFeedback,
        [STRUCT("host" AS key,
          "localhost" AS value),
        STRUCT("location" AS key,
          "Paris" AS value)] AS tags,
        TRUE AS reviewed,
        [STRUCT("1" as saleId, STRUCT("101" AS _id,
            "PRODUCT" AS feedbackType,
            DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS productFeedback),
        STRUCT("1" as saleId, STRUCT("102" AS _id,
            "PRODUCT" AS feedbackType,
            DATE(TIMESTAMP("2017-01-20 14:06:51.655")) AS createDate) AS productFeedback) ] AS saleItems,
        DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS latestFeedbackDate )
Filtering requires a source filter query that flattens all the nested fields:

SELECT
  saleId,
  serviceFeedback,
  saleTags,
  reviewed,
  saleItems,
  latestFeedbackDate
FROM (
  SELECT
    sale._id AS saleId,
    serviceFeedback,
    sale.tags AS saleTags,
    reviewed,
    saleItems,
    latestFeedbackDate
  FROM
    `Sale` AS sale,
    sale.saleItems AS saleItems
  WHERE
    reviewed = TRUE
    AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
    AND serviceFeedback._id IS NOT NULL
    AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655")))
ORDER BY
  latestFeedbackDate DESC
LIMIT
  20
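The main problem is that, after this filtering, I want to group all saleItems by sale._id to get back the initial structure, and to retrieve the serviceFeedback field with its STRUCT type.

The expected result in JSON format is: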

{
    "saleId":"1",
    "serviceFeedback":{"_id":"11","feedbackType":"SERVICE","createDate":"2017-01-20"},
    "saleTags":[{"key":"host","value":"localhost"},{"key":"location","value":"Paris"}],
    "reviewed":"true",
    "saleItems":[
        {"saleId":"1","productFeedback":{"_id":"101","feedbackType":"PRODUCT","createDate":"2017-01-20"}},
        {"saleId":"1","productFeedback":{"_id":"102","feedbackType":"PRODUCT","createDate":"2017-01-20"}}
    ],
    "latestFeedbackDate":"2017-01-20"
}
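
To compare a query's output against a JSON shape like the one above, one option is BigQuery's TO_JSON_STRING function, which serializes a whole row when given the table alias. The following is only a sketch: the trimmed-down Sale CTE and the alias t are assumptions made for illustration, not part of the original schema.

```sql
#standardSQL
-- Hypothetical, reduced version of the Sale sample; only a few fields are kept.
WITH Sale AS (
  SELECT
    "1" AS _id,
    STRUCT("11" AS _id, "SERVICE" AS feedbackType, DATE "2017-01-20" AS createDate) AS serviceFeedback,
    TRUE AS reviewed
)
-- Passing the table alias serializes the entire row as one JSON object.
SELECT TO_JSON_STRING(t) AS json
FROM Sale AS t
```

Note that BOOL columns are serialized as JSON booleans, so reviewed would appear as true rather than the quoted "true" shown in the expected result.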
I wrote the simplest query that came to mind. It produces the correct result, but perhaps it can be rewritten more efficiently:

SELECT
  saleId,
  serviceFeedback,
  latestFeedbackDate,
  subQuery.saleItems as saleItems
FROM
  sale
RIGHT JOIN (
  SELECT
    saleId,
    ARRAY_AGG(saleItems) as saleItems
  FROM (
    SELECT
      saleId,
      saleItems
    FROM (
      SELECT
        sale._id AS saleId,
        latestFeedbackDate,
        saleItems
      FROM
        `Sale` AS sale,
        sale.saleItems AS saleItems
      WHERE
        reviewed = TRUE
        AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
        AND serviceFeedback._id IS NOT NULL
        AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655")))
    ORDER BY
      latestFeedbackDate DESC)
  GROUP BY
    saleId
    ) AS subQuery
ON
  sale._id = subQuery.saleId
Can you suggest a better solution that achieves the same result?


The query below produces exactly the same schema as the original table, with the required filters applied to saleItems:

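#standardSQL
SELECT * REPLACE(
  ARRAY(
    SELECT saleItems
    FROM UNNEST(saleItems) saleItems
    WHERE reviewed = TRUE
      AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
      AND serviceFeedback._id IS NOT NULL
      AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
  ) AS saleItems)
FROM Sale

If you only need a subset of the fields, use the example below: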

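#standardSQL
SELECT
  _id AS saleId,
  serviceFeedback,
  ARRAY(
    SELECT saleItems
    FROM UNNEST(saleItems) saleItems
    WHERE reviewed = TRUE
      AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
      AND serviceFeedback._id IS NOT NULL
      AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
  ) AS saleItems
FROM Sale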
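
One difference from the flattened query in the question is worth noting: with every predicate inside the ARRAY subquery, a sale that fails the row-level conditions (reviewed, serviceFeedback) is still returned, just with an empty saleItems array, whereas the cross join drops it. The sketch below is one possible variation, not the answerer's exact query. It moves the row-level predicates to the outer WHERE, discards sales whose filtered array is empty, and restores the ORDER BY / LIMIT from the question. The reduced Sale CTE, the Filtered CTE name, the item alias, and the second item's date (changed to 2017-01-10 so that the filter has something to remove) are all assumptions made for illustration.

```sql
#standardSQL
WITH Sale AS (
  -- Hypothetical, reduced sample row; item 102 is deliberately dated before the cutoff.
  SELECT
    "1" AS _id,
    STRUCT("11" AS _id, "SERVICE" AS feedbackType, DATE "2017-01-20" AS createDate) AS serviceFeedback,
    TRUE AS reviewed,
    [STRUCT("1" AS saleId,
            STRUCT("101" AS _id, "PRODUCT" AS feedbackType, DATE "2017-01-20" AS createDate) AS productFeedback),
     STRUCT("1" AS saleId,
            STRUCT("102" AS _id, "PRODUCT" AS feedbackType, DATE "2017-01-10" AS createDate) AS productFeedback)
    ] AS saleItems,
    DATE "2017-01-20" AS latestFeedbackDate
),
Filtered AS (
  SELECT * REPLACE(
    -- Rebuild saleItems in place, keeping only the items that pass the item-level filter.
    ARRAY(
      SELECT item
      FROM UNNEST(saleItems) AS item
      WHERE item.productFeedback.createDate >= DATE "2017-01-18"
    ) AS saleItems)
  FROM Sale
  -- Row-level predicates do not depend on the item, so they can filter whole sales here.
  WHERE reviewed = TRUE
    AND serviceFeedback._id IS NOT NULL
    AND serviceFeedback.createDate >= DATE "2017-01-18"
)
SELECT *
FROM Filtered
-- Drop sales that have no matching items, mirroring the behavior of the cross join.
WHERE ARRAY_LENGTH(saleItems) > 0
ORDER BY latestFeedbackDate DESC
LIMIT 20
```

With this sample row, the result should contain sale "1" with only item 101 left in saleItems. Because no join or GROUP BY is involved, each sale is read once and the array is filtered within the row.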