Hive: running sum while tag <> 0, reset to 0 when tag = 0?

Tags: hive, conditional, lag, cumulative-sum

How can I get a running sum of tag that resets to zero whenever tag = 0, as in the sample table (customer, txn_date, tag, running_sum)? TIA

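What you need to do is create "groups" for each stretch of 1s and 0s. You can get the groups by creating a boolean flag and then taking a cumulative sum over that flag column. From there, you can take a cumulative sum of the original tag column within each group created in the subquery.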
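To make the grouping trick easier to follow, here is a minimal sketch in plain Python that traces the three steps on the sample tag values for customer A. It is an illustration only, not the Hive query itself: the list names mirror the tag_flg and flg_sum columns of the query, while the `totals` dictionary is a hypothetical helper, and the rows are assumed to be already ordered by txn_date.

```python
from itertools import accumulate

# Sample tag values for customer A, assumed already ordered by txn_date
# (1-Jan-17 through 15-Jan-17).
tags = [1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0]

# Step 1: boolean flag, 1 on every reset row (tag is not 1), 0 otherwise.
tag_flg = [0 if t == 1 else 1 for t in tags]

# Step 2: cumulative sum of the flag. It increases on each reset row,
# so every stretch of rows between resets shares one group label.
flg_sum = list(accumulate(tag_flg))

# Step 3: running sum of tag, accumulated separately per group label.
totals = {}  # hypothetical helper: group label -> running total so far
running_sum = []
for t, g in zip(tags, flg_sum):
    totals[g] = totals.get(g, 0) + t
    running_sum.append(totals[g])

for row in zip(tags, tag_flg, flg_sum, running_sum):
    print(row)  # (tag, tag_flg, flg_sum, running_sum)
```

Each reset row (tag = 0) opens a new group and contributes 0 to it, which is why the running sum restarts at 0 on that row and counts up again from 1 on the next row with tag = 1.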

Sample data from the question, with the desired running_sum column:

customer    txn_date    tag running_sum
A           1-Jan-17    1   1
A           2-Jan-17    1   2
A           3-Jan-17    1   3
A           4-Jan-17    1   4
A           5-Jan-17    1   5
A           6-Jan-17    1   6
A           7-Jan-17    0   0
A           8-Jan-17    1   1
A           9-Jan-17    1   2
A           10-Jan-17   1   3
A           11-Jan-17   0   0
A           12-Jan-17   0   0
A           13-Jan-17   1   1
A           14-Jan-17   1   2
A           15-Jan-17   0   0

Query

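-- Step 3: running sum of tag within each (customer, flg_sum) group
SELECT customer
  , txn_date
  , tag
  , SUM(tag) OVER (PARTITION BY customer, flg_sum ORDER BY txn_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
FROM (
  -- Step 2: cumulative sum of the flag; it increases on every reset row, which labels the groups
  SELECT *
    , SUM(tag_flg) OVER (PARTITION BY customer ORDER BY txn_date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS flg_sum
  FROM (
    -- Step 1: flag is 0 when tag = 1 and 1 otherwise (the reset rows)
    SELECT *
      , CASE WHEN tag = 1 THEN 0 ELSE 1 END AS tag_flg
    FROM database.table ) x ) y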

Output

customer    txn_date    tag running_sum
A           1-Jan-17    1   1
A           2-Jan-17    1   2
A           3-Jan-17    1   3
A           4-Jan-17    1   4
A           5-Jan-17    1   5
A           6-Jan-17    1   6
A           7-Jan-17    0   0
A           8-Jan-17    1   1
A           9-Jan-17    1   2
A           10-Jan-17   1   3
A           11-Jan-17   0   0
A           12-Jan-17   0   0
A           13-Jan-17   1   1
A           14-Jan-17   1   2
A           15-Jan-17   0   0

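Are you trying to write a SQL query? If so, what have you tried so far?

Thank you @gobrewers14! It worked! I was actually exploring the lag and lead functions for this problem, but to no avail.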