Google BigQuery: window aggregation, year over year
I'm trying to use a window function to get, for each day on which a SKU had sales, the sum of that SKU's quantity over the last 365 days. If there were a sale every day, I could use ROWS with PRECEDING, like this:
ORDER BY
CalendarFullDate ROWS BETWEEN 364 PRECEDING AND CURRENT ROW
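But in this case the dates are not evenly spaced, and many days have no sales (i.e. I can't go back 364 rows on the assumption that every day has a sale).
So, for the test example below, is it possible to use a window with some kind of WHERE clause, so that I sum over at most the preceding 364 days?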
WITH samples AS (
SELECT "1" AS SKU, DATE("2018-10-27") AS CalendarFullDate, 86.0 AS DailySalesQty UNION ALL (
SELECT "1" AS SKU, DATE("2018-10-20"), 84.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-09-29"), 88.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-09-14"), 42.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-09-01"), 21.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-05-05"), 25.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-04-28"), 97.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-03-31"), 244.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-03-24"), 68.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-02-23"), 52.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-02-10"), 48.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-01-21"), 243.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-01-18"), 2.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2018-01-06"), 190.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-12-26"), 310.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-12-09"), 240.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-11-03"), 30.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-10-21"), 164.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-09-30"), 44.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-09-09"), 55.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-09-01"), 35.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-05-20"), 60.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-05-06"), 68.0 ) UNION ALL (
SELECT "1" AS SKU, DATE("2017-04-15"), 136.0) UNION ALL (
SELECT "2" AS SKU, DATE("2018-10-24"), 46.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-10-18"), 56.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-09-16"), 19.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-09-02"), 42.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-09-01"), 45.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-07-05"), 25.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-06-28"), 210.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-05-31"), 44.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-05-24"), 168.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-04-23"), 152.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-03-10"), 8.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-02-21"), 23.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-01-18"), 20.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2018-01-06"), 10.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-12-26"), 30.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-11-09"), 1240.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-11-03"), 323.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-10-21"), 123.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-09-30"), 444.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-09-09"), 555.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-08-01"), 35.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-06-20"), 6.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-05-06"), 68.0 ) UNION ALL (
SELECT "2" AS SKU, DATE("2017-04-15"), 136.0) UNION ALL (
SELECT "2" AS SKU, DATE("2017-04-09"), 136.0)
)
SELECT
SKU,
CalendarFullDate,
SUM(DailySalesQty) OVER(win)
FROM
samples WINDOW win AS (
PARTITION BY
SKU
ORDER BY
CalendarFullDate
RANGE BETWEEN DATE_TRUNC(CalendarFullDate,INTERVAL 364 DAY) AND CalendarFullDate)
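I know the code above doesn't work with RANGE; it's pseudocode for what I really want to do. I also tried a WHERE clause, but that isn't allowed.
Can this even be done with a window? It would be a nice, clean approach, but I'm not sure a window aggregation can express a condition like this.
Note: this is a trimmed-down version of the real data, which has 5 fields as the partition and 20+ measures to aggregate, and it is a huge dataset (1 TB), so I'd like the solution to be efficient as well.
Thoughts?
Cheers
Below is for BigQuery Standard SQL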
#standardSQL
SELECT
SKU,
CalendarFullDate,
SUM(DailySalesQty) OVER(win) SalesQty365days
FROM (
SELECT
SKU,
CalendarFullDate,
DailySalesQty,
UNIX_DATE(CalendarFullDate) unix_days
FROM samples
)
WINDOW win AS (
PARTITION BY SKU ORDER BY unix_days
RANGE BETWEEN 364 PRECEDING AND CURRENT ROW
)
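The trick here is to "convert" the DATE-typed CalendarFullDate field to an integer number of days since the epoch, so that it can be used in the ORDER BY and RANGE parts of the window expression.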
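Since the question mentions several partition fields and many measures, here is a minimal sketch of how the same named window might be reused for more than one aggregate. The Store and DailySalesAmt columns and the three sample rows are hypothetical, added only for illustration; the real partition fields and measures are not shown in the question.

```sql
#standardSQL
-- Hypothetical sample: Store and DailySalesAmt are illustrative columns,
-- not part of the original question.
WITH samples AS (
  SELECT "1" AS SKU, "A" AS Store, DATE "2018-10-27" AS CalendarFullDate,
         86.0 AS DailySalesQty, 430.0 AS DailySalesAmt UNION ALL
  SELECT "1", "A", DATE "2018-10-20", 84.0, 420.0 UNION ALL
  SELECT "1", "A", DATE "2017-10-21", 164.0, 820.0
)
SELECT
  SKU,
  Store,
  CalendarFullDate,
  -- Every measure reuses the same named window definition.
  SUM(DailySalesQty) OVER win AS SalesQty365days,
  SUM(DailySalesAmt) OVER win AS SalesAmt365days
FROM samples
WINDOW win AS (
  -- List all partition fields here.
  PARTITION BY SKU, Store
  -- RANGE needs a numeric ordering key, hence days since the epoch.
  ORDER BY UNIX_DATE(CalendarFullDate)
  RANGE BETWEEN 364 PRECEDING AND CURRENT ROW
)
```

Note that the frame is inclusive at both ends, so 364 PRECEDING covers 365 calendar days. In this sample the 2017-10-21 row lies exactly 364 days before 2018-10-20 and is counted in that row's sum, but it is 371 days before 2018-10-27 and is left out of that one.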
Thanks again! Love your work.
Any update on the "global alias" question: did it work for you?