SQL query to return grouped results as a single row
If I have a jobs table like:
|id |created_at |status   |
---------------------------
|1  |01-01-2015 |error    |
|2  |01-01-2015 |complete |
|3  |01-01-2015 |error    |
|4  |01-02-2015 |complete |
|5  |01-02-2015 |complete |
|6  |01-03-2015 |error    |
|7  |01-03-2015 |on hold  |
|8  |01-03-2015 |complete |
I want a query that groups the rows by date and counts the occurrences of each status, as well as the total number of statuses for that date.
SELECT created_at, status, count(status)
FROM jobs
GROUP BY created_at, status;
This gives me:
|created_at |status   |count|
-----------------------------
|01-01-2015 |error    |2    |
|01-01-2015 |complete |1    |
|01-02-2015 |complete |2    |
|01-03-2015 |error    |1    |
|01-03-2015 |on hold  |1    |
|01-03-2015 |complete |1    |
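Now I would like to condense this into one row per unique created_at date, with some kind of multi-column layout for the statuses. One constraint is that status is any one of 5 possible words, but a given date might not have every status. I would also like a total of all statuses for each day. So the expected result would look like this: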
|date       |total |errors |completed |on_hold |
-----------------------------------------------
|01-01-2015 |3     |2      |1         |null    |
|01-02-2015 |2     |null   |2         |null    |
|01-03-2015 |3     |1      |1         |1       |
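The columns could be built dynamically from:
SELECT DISTINCT status FROM jobs;
The result is null for any day that has no status of that type. I am no SQL expert, but I am trying to do this in a DB view so that I don't have to bog Rails down with multiple queries.
I am using PostgreSQL, but would like to keep it as straightforward as possible. I have tried to understand aggregate functions well enough to use some other tools, without success.

The following should work in any RDBMS: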
SELECT created_at, count(status) AS total,
sum(case when status = 'error' then 1 end) as errors,
sum(case when status = 'complete' then 1 end) as completed,
sum(case when status = 'on hold' then 1 end) as on_hold
FROM jobs
GROUP BY created_at;
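The query uses conditional aggregation to pivot the grouped data. It assumes that the status values are known in advance. If there are additional status values, just add the corresponding sum(case ...) expressions.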
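As a minimal sketch of that last point, the same query with one extra column might look like the following. The status value 'queued' and the column name queued are hypothetical and only illustrate the pattern; substitute whatever values your table actually contains.

```sql
-- Same conditional aggregation as above, extended by one status.
-- 'queued' is a hypothetical status value used purely for illustration.
SELECT created_at, count(status) AS total,
       sum(case when status = 'error'    then 1 end) as errors,
       sum(case when status = 'complete' then 1 end) as completed,
       sum(case when status = 'on hold'  then 1 end) as on_hold,
       sum(case when status = 'queued'   then 1 end) as queued
FROM jobs
GROUP BY created_at;
```

Because the case expressions have no else branch, a date with no matching rows yields NULL in that column, which matches the expected result in the question.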

An actual crosstab query would look like this:
SELECT * FROM crosstab(
$$SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
ORDER BY 1, 2$$
,$$SELECT unnest('{error,complete,"on hold"}'::text[])$$)
AS ct (date date, errors int, completed int, on_hold int);
It should perform very well.
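Note that crosstab() is not built into core Postgres; it is provided by the additional module tablefunc. Assuming you have the privileges to create extensions in the current database, a one-time setup along these lines should be enough before running the query above:

```sql
-- crosstab() is provided by the "tablefunc" module.
-- Run once per database; IF NOT EXISTS makes the statement safe to repeat.
CREATE EXTENSION IF NOT EXISTS tablefunc;
```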
The above does not yet include the total per date.
Postgres 9.5 introduced the ROLLUP clause, which is a good fit for this case:
SELECT * FROM crosstab(
$$SELECT created_at, COALESCE(status, 'total'), ct
FROM (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY created_at, ROLLUP(status)
) sub
ORDER BY 1, 2$$
,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
AS ct (date date, total int, errors int, completed int, on_hold int);
For Postgres 9.4 or older, use this query instead:
WITH cte AS (
SELECT created_at, status, count(*) AS ct
FROM jobs
GROUP BY 1, 2
)
TABLE cte
UNION ALL
SELECT created_at, 'total', sum(ct)
FROM cte
GROUP BY 1
ORDER BY 1
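The CTE query above presumably replaces the first argument of crosstab(). A sketch of the complete call could look like this, assuming status is a text-like column (so the literal 'total' can be combined with it in the UNION) and that the tablefunc module is installed:

```sql
-- Pre-9.5 variant: the per-date total is appended with UNION ALL instead of ROLLUP.
-- Assumes jobs.status is text/varchar and the tablefunc extension is available.
SELECT * FROM crosstab(
  $$WITH cte AS (
       SELECT created_at, status, count(*) AS ct
       FROM   jobs
       GROUP  BY 1, 2
       )
    TABLE cte
    UNION ALL
    SELECT created_at, 'total', sum(ct)
    FROM   cte
    GROUP  BY 1
    ORDER  BY 1$$
 ,$$SELECT unnest('{total,error,complete,"on hold"}'::text[])$$)
AS ct (date date, total int, errors int, completed int, on_hold int);
```

Ordering by the first column only should suffice here, since the two-argument form of crosstab() matches values to output columns by category name and only needs the rows of one date to arrive together.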
If you want to stick with a simple query, this is a bit shorter:
SELECT created_at
, count(*) AS total
, count(status = 'error' OR NULL) AS errors
, count(status = 'complete' OR NULL) AS completed
, count(status = 'on hold' OR NULL) AS on_hold
FROM jobs
GROUP BY 1;
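Using count(status) for the total per date is error-prone, because it does not count rows with a NULL value in status. Use count(*) instead, which is also shorter and a bit faster.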
In Postgres 9.4+, you can use the new aggregate FILTER clause instead.
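A sketch of the simple query rewritten with FILTER, following the syntax quoted in the comments below (count(*) filter (where status = 'error')), might look like this. The column aliases are carried over from the earlier queries.

```sql
-- Requires Postgres 9.4 or later: each aggregate counts only the rows
-- that pass its own FILTER condition.
SELECT created_at
     , count(*)                                    AS total
     , count(*) FILTER (WHERE status = 'error')    AS errors
     , count(*) FILTER (WHERE status = 'complete') AS completed
     , count(*) FILTER (WHERE status = 'on hold')  AS on_hold
FROM   jobs
GROUP  BY created_at;
```

Note that count() returns 0 rather than NULL when no row matches. If you need NULL as in the expected result of the question, wrapping each count in NULLIF(..., 0) is one option.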

Postgres, as mentioned above. But it would be even better if there were an agnostic way to do it.
Is the number of status values known (and fixed)?
Yes, status is one of 8 possible words.
Then Giorgos's answer should work.
Perfect. I hadn't thought of using a case statement. Very simple. I'll accept the answer in about 3 minutes, when the system lets me. (I didn't know there was a "cool-down" period for accepting answers. Ha!)
@Beartech: In Postgres 9.4 you can also use the filter clause, which does the same thing but is more readable: count(*) filter (where status = 'error') as errors