SQL: fill in missing dates and balances in a table to track the balance


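I hope you can help me with this. I have only just started writing SQL in BigQuery, so my question may seem basic. I have a table that records a date and a balance whenever the balance changes. It looks something like this: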


+------------+-----------+------+---------+
|    Date    | seller_ID | Name | Balance |
+------------+-----------+------+---------+
| 2020-09-10 |         1 | John |    10   |
| 2020-09-13 |         1 | John |    8    |
| 2020-09-15 |         1 | John |    6    |
+------------+-----------+------+---------+
However, I need to create a new table that contains the balance for every day, like this:

+------------+-----------+------+---------+
|    Date    | seller_ID | Name | Balance |
+------------+-----------+------+---------+
| 2020-09-10 |         1 | John |      10 |
| 2020-09-11 |         1 | John |      10 |
| 2020-09-12 |         1 | John |      10 |
| 2020-09-13 |         1 | John |       8 |
| 2020-09-14 |         1 | John |       8 |
| 2020-09-15 |         1 | John |       6 |
+------------+-----------+------+---------+

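I tried creating a separate table containing every date between the first and the last date and then left-joining the original table to it, but the resulting table was not much help.
Does anyone know what to do in this case?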

To fill NULLs with the previous non-NULL value in BigQuery, you can use LAST_VALUE with IGNORE NULLS:

WITH test_table AS (
  -- Sample data: one row per balance change
  SELECT DATE '2020-09-10' AS Date, 1 AS seller_Id, 'John' AS Name, 10 AS Balance UNION ALL
  SELECT DATE '2020-09-13', 1, 'John', 8 UNION ALL
  SELECT DATE '2020-09-15', 1, 'John', 6
)
SELECT Date,
  -- Carry the most recent non-NULL value forward onto the generated dates
  LAST_VALUE(seller_Id IGNORE NULLS) OVER (ORDER BY Date) AS seller_Id,
  LAST_VALUE(Name IGNORE NULLS) OVER (ORDER BY Date) AS Name,
  LAST_VALUE(Balance IGNORE NULLS) OVER (ORDER BY Date) AS Balance
FROM UNNEST(GENERATE_DATE_ARRAY('2020-09-10', '2020-09-15')) AS Date
LEFT JOIN test_table USING (Date)
ORDER BY Date
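
The query above has no PARTITION BY, so it carries values forward across the whole result as if there were a single seller. A minimal sketch of the same LAST_VALUE ... IGNORE NULLS idea applied per seller follows. The second seller (Mary), the CTE names sellers and calendar, the column calendar_date, and the fixed end date 2020-09-15 are assumptions added for illustration and are not part of the original question.

```sql
WITH test_table AS (
  -- Sample data; the Mary rows are hypothetical, added to show two sellers
  SELECT DATE '2020-09-10' AS Date, 1 AS seller_Id, 'John' AS Name, 10 AS Balance UNION ALL
  SELECT DATE '2020-09-13', 1, 'John', 8 UNION ALL
  SELECT DATE '2020-09-15', 1, 'John', 6 UNION ALL
  SELECT DATE '2020-09-11', 2, 'Mary', 20 UNION ALL
  SELECT DATE '2020-09-14', 2, 'Mary', 15
),
sellers AS (
  -- One row per seller with the first date on which a balance was recorded
  SELECT seller_Id, ANY_VALUE(Name) AS Name, MIN(Date) AS first_date
  FROM test_table
  GROUP BY seller_Id
),
calendar AS (
  -- One row per seller per day, from the seller's first date to an assumed end date
  SELECT s.seller_Id, s.Name, calendar_date
  FROM sellers AS s
  CROSS JOIN UNNEST(GENERATE_DATE_ARRAY(s.first_date, DATE '2020-09-15')) AS calendar_date
)
SELECT
  c.calendar_date AS Date,
  c.seller_Id,
  c.Name,
  -- Carry the last known balance forward within each seller only
  LAST_VALUE(t.Balance IGNORE NULLS) OVER (
    PARTITION BY c.seller_Id ORDER BY c.calendar_date
  ) AS Balance
FROM calendar AS c
LEFT JOIN test_table AS t
  ON t.seller_Id = c.seller_Id AND t.Date = c.calendar_date
ORDER BY c.seller_Id, c.calendar_date
```

Because each seller's calendar starts at that seller's first recorded date, every generated day has an earlier balance to fall back on, so the Balance column should not contain leading NULLs.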


You can do this without a window function on the balance. The key is to use a window function only on the date:

WITH t AS (
      -- Sample data: one row per balance change
      SELECT DATE '2020-09-10' AS Date, 1 AS seller_Id, 'John' AS Name, 10 AS Balance UNION ALL
      SELECT DATE '2020-09-13', 1, 'John', 8 UNION ALL
      SELECT DATE '2020-09-15', 1, 'John', 6
     ),
     tt AS (
      -- For each row, find the date of the next balance change for the same name
      SELECT t.*, LEAD(date) OVER (PARTITION BY name ORDER BY date) AS next_date
      FROM t
     )
-- Expand each row into one row per day, up to the day before the next change
SELECT dte, tt.name, tt.balance
FROM tt LEFT JOIN
     UNNEST(GENERATE_DATE_ARRAY(tt.date, COALESCE(DATE_ADD(tt.next_date, INTERVAL -1 DAY), DATE '2020-09-15'))) dte
     ON true;
(Note: in this case the ON clause is optional. However, I do not like a JOIN without an ON clause unless it is a CROSS JOIN.)

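This has two important advantages over Sergey's solution. Most importantly, it works for multiple names with different time periods.

The second advantage is that it is more efficient, because it does not use window functions to fetch values from preceding rows.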
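
To illustrate the point about multiple names with different time periods, here is a hedged sketch of the same LEAD-based approach run on two sellers. The Mary rows, the last_date column, the choice to partition by seller_Id rather than name, and the choice to extend every seller to the latest date in the table (instead of a hard-coded end date) are assumptions for illustration only.

```sql
WITH t AS (
  -- Sample data; the Mary rows are hypothetical, added to show two sellers
  SELECT DATE '2020-09-10' AS Date, 1 AS seller_Id, 'John' AS Name, 10 AS Balance UNION ALL
  SELECT DATE '2020-09-13', 1, 'John', 8 UNION ALL
  SELECT DATE '2020-09-15', 1, 'John', 6 UNION ALL
  SELECT DATE '2020-09-11', 2, 'Mary', 20 UNION ALL
  SELECT DATE '2020-09-14', 2, 'Mary', 15
),
tt AS (
  SELECT
    t.*,
    -- Date of the next balance change for the same seller (NULL on the last row)
    LEAD(Date) OVER (PARTITION BY seller_Id ORDER BY Date) AS next_date,
    -- Latest date anywhere in the table, used as the assumed end of the range
    MAX(Date) OVER () AS last_date
  FROM t
)
SELECT dte AS Date, tt.seller_Id, tt.Name, tt.Balance
FROM tt
CROSS JOIN UNNEST(
  GENERATE_DATE_ARRAY(
    tt.Date,
    -- Each balance holds until the day before the next change, or until last_date
    COALESCE(DATE_SUB(tt.next_date, INTERVAL 1 DAY), tt.last_date)
  )
) AS dte
ORDER BY tt.seller_Id, dte
```

With this sample data, John is expanded from 2020-09-10 and Mary from 2020-09-11, each starting at their own first recorded date, and Mary's final balance of 15 is carried through 2020-09-15. To persist the result as a new table, the final SELECT could be wrapped in a CREATE TABLE ... AS statement with a dataset and table name of your choosing.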


Please explain what is incorrect about the solution. Isn't this exactly what you wanted? You have that customer's balance for every day?