SQL BigQuery: optimizing a query that filters a nested array of struct fields and returns it regrouped
I am trying to figure out how to write a Google SQL (BigQuery Standard SQL) query that filters on a deeply nested struct, nests the result again, and keeps the record's top-level struct fields at the same level as the array. I have prepared a schema example:
WITH
Sale AS (
SELECT
"1" AS _id,
STRUCT("11" AS _id,
"SERVICE" AS feedbackType,
DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS serviceFeedback,
[STRUCT("host" AS key,
"localhost" AS value),
STRUCT("location" AS key,
"Paris" AS value)] AS tags,
TRUE AS reviewed,
[STRUCT("1" as saleId, STRUCT("101" AS _id,
"PRODUCT" AS feedbackType,
DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS createDate) AS productFeedback),
STRUCT("1" as saleId, STRUCT("102" AS _id,
"PRODUCT" AS feedbackType,
DATE(TIMESTAMP("2017-01-20 14:06:51.655")) AS createDate) AS productFeedback) ] AS saleItems,
DATE(TIMESTAMP("2017-01-20 14:05:51.655")) AS latestFeedbackDate )
Filtering requires a source query that flattens all the nested fields:
SELECT
saleId,
serviceFeedback,
saleTags,
reviewed,
saleItems,
latestFeedbackDate
FROM (
SELECT
sale._id AS saleId,
serviceFeedback,
sale.tags AS saleTags,
reviewed,
saleItems,
latestFeedbackDate
FROM
`Sale` AS sale,
sale.saleItems AS saleItems
WHERE
reviewed = TRUE
AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
AND serviceFeedback._id IS NOT NULL
AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655")))
ORDER BY
latestFeedbackDate DESC
LIMIT
20
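The main problem is that, after this filtering, I want to group all saleItems by sale._id to get back the initial structure, and to retrieve the serviceFeedback field with its STRUCT type.
The expected result in JSON format is: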
{
"saleId":"1",
"serviceFeedback":{"_id":"11","feedbackType":"SERVICE","createDate":"2017-01-20"},
"saleTags":[{"key":"host","value":"localhost"},{"key":"location","value":"Paris"}],
"reviewed":"true",
"saleItems":[
{"saleId":"1","productFeedback":{"_id":"101","feedbackType":"PRODUCT","createDate":"2017-01-20"}},
{"saleId":"1","productFeedback":{"_id":"102","feedbackType":"PRODUCT","createDate":"2017-01-20"}}
],
"latestFeedbackDate":"2017-01-20"
}
I wrote the simplest query that came to mind. It produces the correct result, but perhaps it can be rewritten more efficiently:
SELECT
saleId,
serviceFeedback,
latestFeedbackDate,
subQuery.saleItems as saleItems
FROM
sale
RIGHT JOIN (
SELECT
saleId,
ARRAY_AGG(saleItems) as saleItems
FROM (
SELECT
saleId,
saleItems
FROM (
SELECT
sale._id AS saleId,
latestFeedbackDate,
saleItems
FROM
`Sale` AS sale,
sale.saleItems AS saleItems
WHERE
reviewed = TRUE
AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
AND serviceFeedback._id IS NOT NULL
AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655")))
ORDER BY
latestFeedbackDate DESC)
GROUP BY
saleId
) AS subQuery
ON
sale._id = subQuery.saleId
Can you suggest a better solution that achieves the same result?
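The query below produces exactly the same schema as the original table, with the required filters applied to saleItems:
#standardSQL
SELECT * REPLACE(
  ARRAY(
    SELECT saleItems FROM UNNEST(saleItems) saleItems
    WHERE reviewed = TRUE
    AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
    AND serviceFeedback._id IS NOT NULL
    AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
  ) AS saleItems)
FROM Sale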
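One caveat, as far as I can tell: because every predicate sits inside the ARRAY subquery, a sale that fails the row-level conditions is still returned, just with an empty saleItems array, whereas the flattened query in the question drops such rows. If dropping them is what you want, a minimal sketch is to keep only the per-item condition inside ARRAY(...) and move the row-level conditions to an outer WHERE. The trimmed-down Sale CTE, the alias item, the earlier date on the second item, and the DATE literals are my own illustrative choices, not part of the original data.

```sql
#standardSQL
WITH Sale AS (
  -- Trimmed, hypothetical sample row; the second item is dated before the cutoff
  SELECT
    "1" AS _id,
    STRUCT("11" AS _id, "SERVICE" AS feedbackType, DATE "2017-01-20" AS createDate) AS serviceFeedback,
    TRUE AS reviewed,
    [STRUCT("1" AS saleId,
            STRUCT("101" AS _id, "PRODUCT" AS feedbackType, DATE "2017-01-20" AS createDate) AS productFeedback),
     STRUCT("1" AS saleId,
            STRUCT("102" AS _id, "PRODUCT" AS feedbackType, DATE "2017-01-10" AS createDate) AS productFeedback)] AS saleItems,
    DATE "2017-01-20" AS latestFeedbackDate
)
SELECT * REPLACE (
  -- Per-item filter: keep only the array elements that pass
  ARRAY(
    SELECT item
    FROM UNNEST(saleItems) AS item
    WHERE item.productFeedback.createDate >= DATE "2017-01-18"
  ) AS saleItems)
FROM Sale
-- Row-level filters: applied once per sale rather than once per item
WHERE reviewed = TRUE
  AND serviceFeedback.createDate >= DATE "2017-01-18"
  AND serviceFeedback._id IS NOT NULL
  -- Drop sales for which no item survives the per-item filter
  AND EXISTS (
    SELECT 1
    FROM UNNEST(saleItems) AS item
    WHERE item.productFeedback.createDate >= DATE "2017-01-18")
ORDER BY latestFeedbackDate DESC
LIMIT 20
```

With this sample row the sale should be kept and only item "101" should remain in saleItems. The ORDER BY and LIMIT mirror the question's flattened query and can be dropped if they are not needed.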
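If you need only a subset of the fields, use the example below:
#standardSQL
SELECT
  _id AS saleId,
  serviceFeedback,
  ARRAY(
    SELECT saleItems FROM UNNEST(saleItems) saleItems
    WHERE reviewed = TRUE
    AND serviceFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
    AND serviceFeedback._id IS NOT NULL
    AND saleItems.productFeedback.createDate >= DATE(TIMESTAMP("2017-01-18 14:05:51.655"))
  ) AS saleItems
FROM Sale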
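To compare the output with the expected JSON in the question, one option is to serialize each result row with TO_JSON_STRING. This is only a sketch: the trimmed-down Sale CTE, the aliases item, t and saleJson, and the DATE literal are assumptions of mine for illustration.

```sql
#standardSQL
WITH Sale AS (
  -- Hypothetical single-row sample with just the fields used below
  SELECT
    "1" AS _id,
    STRUCT("11" AS _id, "SERVICE" AS feedbackType, DATE "2017-01-20" AS createDate) AS serviceFeedback,
    [STRUCT("1" AS saleId,
            STRUCT("101" AS _id, "PRODUCT" AS feedbackType, DATE "2017-01-20" AS createDate) AS productFeedback)] AS saleItems
)
-- Serialize each filtered row as one JSON string
SELECT TO_JSON_STRING(t) AS saleJson
FROM (
  SELECT
    _id AS saleId,
    serviceFeedback,
    ARRAY(
      SELECT item
      FROM UNNEST(saleItems) AS item
      WHERE item.productFeedback.createDate >= DATE "2017-01-18"
    ) AS saleItems
  FROM Sale
) AS t
```

Each output row is a single JSON object whose keys follow the column names of the inner SELECT, which makes it easy to diff against the expected structure.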