SQL: how do I sum the values of one column over several rows?
Tags: sql, date, group-by, gaps-and-islands, clickhouse
I have this table and I want to add up the values of the 'change' column over several rows. More precisely: from each row whose 'ne' value is zero up to the next row whose 'ne' is zero, including the first of those rows but not the second one itself. Any answer would be appreciated.
┌─rn─┬───────date─┬─ne─┬───────change─┐
│ 0 │ 2008-12-07 │ 0 │ -10330848398 │
│ 1 │ 2009-04-14 │ 1 │ -61290 │
│ 2 │ 2009-04-26 │ 1 │ 9605743360 │
│ 3 │ 2013-07-06 │ 0 │ -32028871920 │
│ 4 │ 2014-01-12 │ 1 │ -42296164902 │
│ 5 │ 2015-06-08 │ 1 │ 59100383646 │
└────┴────────────┴────┴──────────────┘
The result we expect looks like this:
row start end sum(change)
--------------------------------------------------
0 | 2008-12-07 | 2009-04-26 | -725,166,328
--------------------------------------------------
1 | 2013-07-06 | 2015-06-08 | -15,224,653,176
--------------------------------------------------
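This is a gaps-and-islands problem. The canonical solution uses window functions, which ClickHouse did not support when this answer was written (newer releases have added them). Here is one way to emulate the conditional window sum with a subquery: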
select
min(date) start_date,
max(date) end_date,
sum(change) sum_change
from (
select
t.*,
(select count(*) from mytable t1 where t1.date <= t.date and t1.ne = 0) grp
from mytable t
) t
group by grp
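To check the subquery idea against the sample data from the question, here is a minimal sketch that runs the same query in an in-memory SQLite database through Python's standard `sqlite3` module. SQLite is only a stand-in here, and the helper name `run_islands` and the added `order by` are assumptions for illustration; this says nothing about whether ClickHouse itself accepts a correlated subquery of this shape.

```python
import sqlite3

# Sample rows from the question: (rn, date, ne, change)
ROWS = [
    (0, "2008-12-07", 0, -10330848398),
    (1, "2009-04-14", 1, -61290),
    (2, "2009-04-26", 1, 9605743360),
    (3, "2013-07-06", 0, -32028871920),
    (4, "2014-01-12", 1, -42296164902),
    (5, "2015-06-08", 1, 59100383646),
]

# Same query as in the answer; grp counts the ne = 0 rows seen so far,
# so every row of one island gets the same grp value.
ISLANDS_SQL = """
select
    min(date) start_date,
    max(date) end_date,
    sum(change) sum_change
from (
    select
        t.*,
        (select count(*) from mytable t1
          where t1.date <= t.date and t1.ne = 0) grp
    from mytable t
) t
group by grp
order by start_date
"""


def run_islands(rows):
    """Load rows into a throwaway SQLite table and run the islands query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table mytable (rn integer, date text, ne integer, change integer)"
    )
    con.executemany("insert into mytable values (?, ?, ?, ?)", rows)
    result = con.execute(ISLANDS_SQL).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in run_islands(ROWS):
        print(row)
```

The two rows it returns match the sums in the expected result. Note that the inner `count(*)` is re-evaluated for every outer row, so the cost grows quickly with table size.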
A comment suggested: SELECT ne, MIN(date) AS start, MAX(date) AS end, SUM(change) AS change, grouped by ne.
Assuming ClickHouse supports variables:
set @block := -1;
select
block as row,
min(date) as start,
max(date) as end,
sum(change)
from
(select
case when ne = 0 then @block:=@block+1 end as dummy,
@block as block,
t.*
from t) tt
group by block;
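The variable-based query boils down to a running counter that is bumped on every ne = 0 row. As a rough illustration of that logic outside SQL, here is a small Python sketch; the function name `sum_by_block` and the tuple layout are made up for this example, and the rows are assumed to arrive already sorted by date.

```python
# Sample rows from the question: (date, ne, change), sorted by date
ROWS = [
    ("2008-12-07", 0, -10330848398),
    ("2009-04-14", 1, -61290),
    ("2009-04-26", 1, 9605743360),
    ("2013-07-06", 0, -32028871920),
    ("2014-01-12", 1, -42296164902),
    ("2015-06-08", 1, 59100383646),
]


def sum_by_block(rows):
    """Return (block, start, end, sum) per island, one island per ne = 0 row."""
    block = -1      # plays the role of @block, starting at -1
    totals = {}     # block number -> [start, end, running sum]
    for date, ne, change in rows:
        if ne == 0:
            block += 1  # a zero opens a new block
        if block not in totals:
            totals[block] = [date, date, 0]
        entry = totals[block]
        entry[0] = min(entry[0], date)  # ISO dates compare correctly as strings
        entry[1] = max(entry[1], date)
        entry[2] += change
    return [(b, s, e, total) for b, (s, e, total) in sorted(totals.items())]


if __name__ == "__main__":
    for row in sum_by_block(ROWS):
        print(row)
```

The result depends entirely on the order in which rows are visited, which is also the weak point of the SQL version: the inner select there has no ORDER BY.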
This cannot solve the problem on big data (more than 100 million rows).
SELECT
d[1] AS s,
d[-1] AS e,
arraySum(c) AS sm
FROM
(
SELECT
arraySplit((x, y) -> (NOT y), d, n) AS dd,
arraySplit((x, y) -> (NOT y), c, n) AS cc
FROM
(
SELECT
groupArray(date) AS d,
groupArray(ne) AS n,
groupArray(change) AS c
FROM
(
SELECT *
FROM mytable
ORDER BY rn ASC
)
)
)
ARRAY JOIN
dd AS d,
cc AS c
┌─s──────────┬─e──────────┬───────────sm─┐
│ 2008-12-07 │ 2009-04-26 │ -725166328 │
│ 2013-07-06 │ 2015-06-08 │ -15224653176 │
└────────────┴────────────┴──────────────┘
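For readers who do not know `arraySplit`, the following Python sketch mimics what the array query does: it cuts the date and change arrays in front of every element where the condition holds (here `NOT ne`, i.e. ne = 0), then takes the first date, the last date and the sum of each piece. The helper names `array_split` and `islands` are hypothetical, and this only models the behaviour as documented for ClickHouse; it is not ClickHouse code.

```python
def array_split(flags, values):
    """Split values to the left of every element whose flag is truthy.

    The first element never causes a split, it just opens the first chunk.
    """
    chunks = []
    for flag, value in zip(flags, values):
        if flag or not chunks:
            chunks.append([])
        chunks[-1].append(value)
    return chunks


def islands(dates, ne, changes):
    """Return (first date, last date, sum of change) for each island."""
    split_here = [not n for n in ne]        # the lambda (x, y) -> (NOT y)
    dd = array_split(split_here, dates)     # like the dd column
    cc = array_split(split_here, changes)   # like the cc column
    return [(d[0], d[-1], sum(c)) for d, c in zip(dd, cc)]


if __name__ == "__main__":
    dates = ["2008-12-07", "2009-04-14", "2009-04-26",
             "2013-07-06", "2014-01-12", "2015-06-08"]
    ne = [0, 1, 1, 0, 1, 1]
    changes = [-10330848398, -61290, 9605743360,
               -32028871920, -42296164902, 59100383646]
    for row in islands(dates, ne, changes):
        print(row)
```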
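Just another way to solve this task:
WITH (SELECT arraySort(groupArray(rn)) FROM test_table WHERE ne = 0) AS group_start_ids
SELECT argMin(date, rn) start, argMax(date, rn) end, sum(change)
FROM
(
SELECT rn, date, change
FROM test_table
ORDER BY rn
)
GROUP BY arrayFirstIndex(x -> rn ...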
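The grouping expression of this last query breaks off after `x -> rn`, so the following Python sketch is only a guess at the idea, not the author's expression. It assumes the sorted rn values of the ne = 0 rows act as island start ids and that each row belongs to the island with the largest start id not exceeding its own rn; `bisect_right` is swapped in to compute that key, and the function name is made up.

```python
from bisect import bisect_right

# Sample rows from the question: (rn, date, ne, change)
ROWS = [
    (0, "2008-12-07", 0, -10330848398),
    (1, "2009-04-14", 1, -61290),
    (2, "2009-04-26", 1, 9605743360),
    (3, "2013-07-06", 0, -32028871920),
    (4, "2014-01-12", 1, -42296164902),
    (5, "2015-06-08", 1, 59100383646),
]


def islands_by_start_ids(rows):
    """Group rows by the position of their rn among the island start ids."""
    # Counterpart of arraySort(groupArray(rn)) over the ne = 0 rows
    start_ids = sorted(rn for rn, _date, ne, _change in rows if ne == 0)
    groups = {}
    for rn, date, _ne, change in rows:
        key = bisect_right(start_ids, rn)  # how many start ids are <= rn
        groups.setdefault(key, []).append((rn, date, change))
    result = []
    for key in sorted(groups):
        members = groups[key]
        start = min(members)[1]  # date at the smallest rn, like argMin(date, rn)
        end = max(members)[1]    # date at the largest rn, like argMax(date, rn)
        result.append((start, end, sum(c for _rn, _d, c in members)))
    return result


if __name__ == "__main__":
    for row in islands_by_start_ids(ROWS):
        print(row)
```

Unlike the array query, only the start ids are gathered into one array here; the rows themselves are aggregated by an ordinary group key.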