BigQuery moving average with missing values (tags: sql, google-bigquery, moving-average)


I have the following data:

with dummy_data as 
(
SELECT '2017-01-01' as ref_month, 18 as value, 1 as id
UNION ALL SELECT '2017-02-01' as ref_month, 20 as value, 1 as id
UNION ALL SELECT '2017-03-01' as ref_month, 22 as value, 1 as id
-- UNION ALL SELECT '2017-04-01' as ref_month, 28 as value, 1 as id
UNION ALL SELECT '2017-05-01' as ref_month, 30 as value, 1 as id
UNION ALL SELECT '2017-06-01' as ref_month, 37 as value, 1 as id
UNION ALL SELECT '2017-07-01' as ref_month, 42 as value, 1 as id
-- UNION ALL SELECT '2017-08-01' as ref_month, 55 as value, 1 as id
-- UNION ALL SELECT '2017-09-01' as ref_month, 49 as value, 1 as id
UNION ALL SELECT '2017-10-01' as ref_month, 51 as value, 1 as id
UNION ALL SELECT '2017-11-01' as ref_month, 57 as value, 1 as id
UNION ALL SELECT '2017-12-01' as ref_month, 56 as value, 1 as id
UNION ALL SELECT '2017-01-01' as ref_month, 18 as value, 2 as id
UNION ALL SELECT '2017-02-01' as ref_month, 20 as value, 2 as id
UNION ALL SELECT '2017-03-01' as ref_month, 22 as value, 2 as id
UNION ALL SELECT '2017-04-01' as ref_month, 28 as value, 2 as id
-- UNION ALL SELECT '2017-05-01' as ref_month, 30 as value, 2 as id
-- UNION ALL SELECT '2017-06-01' as ref_month, 37 as value, 2 as id
UNION ALL SELECT '2017-07-01' as ref_month, 42 as value, 2 as id
UNION ALL SELECT '2017-08-01' as ref_month, 55 as value, 2 as id
UNION ALL SELECT '2017-09-01' as ref_month, 49 as value, 2 as id
-- UNION ALL SELECT '2017-10-01' as ref_month, 51 as value, 2 as id
UNION ALL SELECT '2017-11-01' as ref_month, 57 as value, 2 as id
UNION ALL SELECT '2017-12-01' as ref_month, 56 as value, 2 as id
)
I want to compute a moving average for each id. I know you can do something like this:

select 
    id
  , ref_month
  , avg(value) over (partition by id order by ref_month ROWS BETWEEN 5 PRECEDING AND CURRENT ROW ) as moving_avg
from 
    dummy_data
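But as you can see from my dummy data, some values are missing. How can I easily compute the moving average when some months are missing? I was thinking of first computing a complete date range: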

date_range AS
(
  SELECT reference_month
  FROM UNNEST(
      GENERATE_DATE_ARRAY(PARSE_DATE('%Y-%m-%d', (SELECT MIN(ref_month) FROM dummy_data)), PARSE_DATE('%Y-%m-%d', (SELECT MAX(ref_month) FROM dummy_data)), INTERVAL 1 MONTH)
  ) AS reference_month
)
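then taking the Cartesian product with the ids and joining the result back to my dummy data, but that feels like an anti-pattern. Any ideas on the best way to do this? Thanks.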
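The approach described above (complete date range, Cartesian product with the ids, join back) could be sketched roughly as follows. This is only an illustration under assumptions: ref_month is typed as DATE, only the first four id 1 rows from the question are used as sample data, missing months count as 0, the window is 5 months, and the names ids, filled, averaged and has_value are hypothetical helpers that are not in the original post.

```sql
-- Sketch only: the sample rows are the first id 1 rows from the question,
-- with ref_month typed as DATE instead of STRING.
WITH dummy_data AS (
  SELECT DATE '2017-01-01' AS ref_month, 18 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-02-01', 20, 1
  UNION ALL SELECT DATE '2017-03-01', 22, 1
  UNION ALL SELECT DATE '2017-05-01', 30, 1
),
date_range AS (
  -- one row per month between the first and last month in the data
  SELECT reference_month
  FROM UNNEST(GENERATE_DATE_ARRAY(
         (SELECT MIN(ref_month) FROM dummy_data),
         (SELECT MAX(ref_month) FROM dummy_data),
         INTERVAL 1 MONTH)) AS reference_month
),
ids AS (
  SELECT DISTINCT id FROM dummy_data
),
filled AS (
  -- every (id, month) pair; months without data get value 0
  SELECT
    i.id,
    r.reference_month AS ref_month,
    d.value IS NOT NULL AS has_value,
    IFNULL(d.value, 0) AS value
  FROM ids AS i
  CROSS JOIN date_range AS r
  LEFT JOIN dummy_data AS d
    ON d.id = i.id AND d.ref_month = r.reference_month
),
averaged AS (
  -- the series is now gap-free, so a ROWS frame spans exactly 5 months
  SELECT
    id, ref_month, has_value,
    AVG(value) OVER (PARTITION BY id ORDER BY ref_month
                     ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS moving_avg
  FROM filled
)
SELECT id, ref_month, moving_avg
FROM averaged
WHERE has_value          -- drop the months that were only added as filler
ORDER BY id, ref_month;
```

For these four rows the query yields 18, 19, 20 and 18, which agrees with the first four expected values for id 1. The filter on has_value is applied only after the window function has run, so the filler months still contribute zeros to the average.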

Edit:

Expected result. For id 1:

2017-01-01  18
2017-02-01  19
2017-03-01  20
2017-05-01  18
2017-06-01  21.8
2017-07-01  26.2
2017-10-01  26
2017-11-01  30
2017-12-01  32.8
For id 2:

2017-01-01  18
2017-02-01  19
2017-03-01  20
2017-04-01  22
2017-07-01  18.4
2017-08-01  25
2017-09-01  29.2
2017-11-01  40.6
2017-12-01  43.4
This should work:

with dummy_data as 
(
SELECT '2017-01-01' as ref_month, 18 as value, 1 as id
UNION ALL SELECT '2017-02-01' as ref_month, 20 as value, 1 as id
UNION ALL SELECT '2017-03-01' as ref_month, 22 as value, 1 as id
-- UNION ALL SELECT '2017-04-01' as ref_month, 28 as value, 1 as id
UNION ALL SELECT '2017-05-01' as ref_month, 30 as value, 1 as id
UNION ALL SELECT '2017-06-01' as ref_month, 37 as value, 1 as id
UNION ALL SELECT '2017-07-01' as ref_month, 42 as value, 1 as id
-- UNION ALL SELECT '2017-08-01' as ref_month, 55 as value, 1 as id
-- UNION ALL SELECT '2017-09-01' as ref_month, 49 as value, 1 as id
UNION ALL SELECT '2017-10-01' as ref_month, 51 as value, 1 as id
UNION ALL SELECT '2017-11-01' as ref_month, 57 as value, 1 as id
UNION ALL SELECT '2017-12-01' as ref_month, 56 as value, 1 as id
UNION ALL SELECT '2017-01-01' as ref_month, 18 as value, 2 as id
UNION ALL SELECT '2017-02-01' as ref_month, 20 as value, 2 as id
UNION ALL SELECT '2017-03-01' as ref_month, 22 as value, 2 as id
UNION ALL SELECT '2017-04-01' as ref_month, 28 as value, 2 as id
-- UNION ALL SELECT '2017-05-01' as ref_month, 30 as value, 2 as id
-- UNION ALL SELECT '2017-06-01' as ref_month, 37 as value, 2 as id
UNION ALL SELECT '2017-07-01' as ref_month, 42 as value, 2 as id
UNION ALL SELECT '2017-08-01' as ref_month, 55 as value, 2 as id
UNION ALL SELECT '2017-09-01' as ref_month, 49 as value, 2 as id
-- UNION ALL SELECT '2017-10-01' as ref_month, 51 as value, 2 as id
UNION ALL SELECT '2017-11-01' as ref_month, 57 as value, 2 as id
UNION ALL SELECT '2017-12-01' as ref_month, 56 as value, 2 as id
)


select 
    id
  , ref_month
  , avg(avg(value)) over (partition by id order by ref_month) as moving_avg
from 
    dummy_data
    group by id
  , ref_month

If you want the missing values to be treated as 0 and you want a window of 5, then a series of lags may be the simplest approach:

-- Assumes ref_month is a DATE; if it is a STRING, convert it first,
-- e.g. with PARSE_DATE('%Y-%m-%d', ref_month).
select id, ref_month,
       (value +
        -- add an earlier row only if it falls inside the 5-month window ending at ref_month
        (case when lag(ref_month) over (partition by id order by ref_month) >= date_add(ref_month, interval -4 month)
              then lag(value, 1) over (partition by id order by ref_month)
              else 0
         end) +
        (case when lag(ref_month, 2) over (partition by id order by ref_month) >= date_add(ref_month, interval -4 month)
              then lag(value, 2) over (partition by id order by ref_month)
              else 0
         end) +
        (case when lag(ref_month, 3) over (partition by id order by ref_month) >= date_add(ref_month, interval -4 month)
              then lag(value, 3) over (partition by id order by ref_month)
              else 0
         end) +
        (case when lag(ref_month, 4) over (partition by id order by ref_month) >= date_add(ref_month, interval -4 month)
              then lag(value, 4) over (partition by id order by ref_month)
              else 0
         end)
       ) /
       -- divide by 5, or by the number of months elapsed when fewer than 5 are available
       least(5, date_diff(ref_month, min(ref_month) over (partition by id), month) + 1) as moving_avg
from dummy_data;

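The query is more complicated than the logic. It basically adds up the five most recent values and divides by 5, but it also takes the boundary conditions and the missing values into account.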
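The same arithmetic (sum the values in the five calendar months ending at the current row, then divide by 5, or by the number of months elapsed near the start of the series) can also be sketched with a RANGE frame over an integer month index. This is an alternative formulation, not the answer's own query. It assumes ref_month is a DATE; month_idx and indexed are hypothetical helper names, and the sample rows are the first four id 1 rows from the question.

```sql
-- Sketch only: month_idx is a hypothetical helper column counting months
-- from an arbitrary anchor date; any fixed anchor gives the same result.
WITH dummy_data AS (
  SELECT DATE '2017-01-01' AS ref_month, 18 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-02-01', 20, 1
  UNION ALL SELECT DATE '2017-03-01', 22, 1
  UNION ALL SELECT DATE '2017-05-01', 30, 1
),
indexed AS (
  SELECT id, ref_month, value,
         DATE_DIFF(ref_month, DATE '2017-01-01', MONTH) AS month_idx
  FROM dummy_data
)
SELECT
  id,
  ref_month,
  -- RANGE compares month_idx values, so a missing month simply adds nothing
  SUM(value) OVER (PARTITION BY id ORDER BY month_idx
                   RANGE BETWEEN 4 PRECEDING AND CURRENT ROW)
    -- divisor: 5, or the number of months since the first row of this id
    / LEAST(5, month_idx - MIN(month_idx) OVER (PARTITION BY id) + 1) AS moving_avg
FROM indexed
ORDER BY id, ref_month;
```

For the May row the frame covers month_idx 0 through 4, so the sum is 18 + 20 + 22 + 30 = 90 and the divisor is 5, giving 18. The earlier rows give 18, 19 and 20, in line with the expected results listed in the question.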

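The following is for BigQuery Standard SQL, and it actually works! :o) It assumes your ref_month is of DATE data type. If you have it as a STRING, that is still fine; see the note at the bottom of my answer.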

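#standardSQL
SELECT
  id,
  ref_month,
  SUM(value) OVER (rolling_six_days) /
    (LAST_VALUE(month_pos) OVER (rolling_six_days)
     - FIRST_VALUE(month_pos) OVER (rolling_six_days)
     + 1) AS correct_moving_avg
FROM (
  SELECT id, ref_month, value,
    DATE_DIFF(ref_month, '2016-01-01', MONTH) AS month_pos
  FROM dummy_data
)
WINDOW rolling_six_days AS
  (PARTITION BY id ORDER BY month_pos RANGE BETWEEN 5 PRECEDING AND CURRENT ROW)

You can test it and play with it using the sample data below: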

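#standardSQL
WITH dummy_data AS (
  SELECT DATE '2017-01-01' AS ref_month, 18 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-02-01' AS ref_month, 20 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-03-01' AS ref_month, 22 AS value, 1 AS id
  -- UNION ALL SELECT DATE '2017-04-01' AS ref_month, 28 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-05-01' AS ref_month, 30 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-06-01' AS ref_month, 37 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-07-01' AS ref_month, 42 AS value, 1 AS id
  -- UNION ALL SELECT DATE '2017-08-01' AS ref_month, 55 AS value, 1 AS id
  -- UNION ALL SELECT DATE '2017-09-01' AS ref_month, 49 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-10-01' AS ref_month, 51 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-11-01' AS ref_month, 57 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-12-01' AS ref_month, 56 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-01-01' AS ref_month, 18 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-02-01' AS ref_month, 20 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-03-01' AS ref_month, 22 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-04-01' AS ref_month, 28 AS value, 2 AS id
  -- UNION ALL SELECT DATE '2017-05-01' AS ref_month, 30 AS value, 2 AS id
  -- UNION ALL SELECT DATE '2017-06-01' AS ref_month, 37 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-07-01' AS ref_month, 42 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-08-01' AS ref_month, 55 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-09-01' AS ref_month, 49 AS value, 2 AS id
  -- UNION ALL SELECT DATE '2017-10-01' AS ref_month, 51 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-11-01' AS ref_month, 57 AS value, 2 AS id
  UNION ALL SELECT DATE '2017-12-01' AS ref_month, 56 AS value, 2 AS id
)
SELECT
  id,
  ref_month,
  SUM(value) OVER (rolling_six_days) /
    (LAST_VALUE(month_pos) OVER (rolling_six_days)
     - FIRST_VALUE(month_pos) OVER (rolling_six_days)
     + 1) AS correct_moving_avg
FROM (
  SELECT id, ref_month, value,
    DATE_DIFF(ref_month, '2016-01-01', MONTH) AS month_pos
  FROM dummy_data
)
WINDOW rolling_six_days AS
  (PARTITION BY id ORDER BY month_pos RANGE BETWEEN 5 PRECEDING AND CURRENT ROW)
ORDER BY 1, 2

To help you explore the logic, see the extended version of the above query below. It propagates all intermediate values up to the outermost SELECT so that you can see everything: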

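#standardSQL
WITH dummy_data AS (
  SELECT DATE '2017-01-01' AS ref_month, 18 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-02-01' AS ref_month, 20 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-03-01' AS ref_month, 22 AS value, 1 AS id
  -- UNION ALL SELECT DATE '2017-04-01' AS ref_month, 28 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-05-01' AS ref_month, 30 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-06-01' AS ref_month, 37 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-07-01' AS ref_month, 42 AS value, 1 AS id
  -- UNION ALL SELECT DATE '2017-08-01' AS ref_month, 55 AS value, 1 AS id
  -- UNION ALL SELECT DATE '2017-09-01' AS ref_month, 49 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-10-01' AS ref_month, 51 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-11-01' AS ref_month, 57 AS value, 1 AS id
  UNION ALL SELECT DATE '2017-12-01' AS ref_month, 56 AS value, 1 AS id
  UNION ALL SELECT DATE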