SQL average per row across multiple columns with NULL values
I have an application that logs sensor data, and I want to be able to compute an average across multiple sensors: one, two, three, or more.
Edit: these are temperature sensors, so 0 is a value a sensor may legitimately store in the database.
My initial starting point was the following SQL query:
SELECT grid.t5||'.000000' as ts,
avg(t.sensorvalue) sensorvalue1
, avg(w.sensorvalue)AS sensorvalue2
FROM
(SELECT generate_series(min(date_trunc('hour', ts))
,max(ts), interval '5 min') AS t5 FROM device_history_20865735 where
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid
LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min'
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min'
--WHERE t.sensorvalue notnull
GROUP BY grid.t5 ORDER BY grid.t5
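My goal is an average for every 5-minute interval across all available sensors, so NULLs are a problem. I considered using a CASE expression so that, if one sensor's value is NULL, the other sensor's value is used instead: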
SELECT grid.t5||'.000000' as ts,
CASE
WHEN avg(t.sensorvalue) ISNULL THEN avg(w.sensorvalue)
ELSE avg(t.sensorvalue)
END AS sensorvalue
,
CASE
WHEN avg(w.sensorvalue) ISNULL THEN avg(t.sensorvalue)
ELSE avg(w.sensorvalue)
END AS sensorvalue2
FROM
(SELECT generate_series(min(date_trunc('hour', ts)),max(ts), interval '5 min') AS t5
FROM device_history_20865735 where
ts between '2015/05/13 09:00' and '2015/05/14 09:00' ) grid
LEFT JOIN device_history_20865735 t ON t.ts >= grid.t5 AND t.ts < grid.t5 + interval '5 min'
LEFT JOIN device_history_493417852 w ON w.ts >= grid.t5 AND w.ts < grid.t5 + interval '5 min'
GROUP BY grid.t5 ORDER BY grid.t5
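But to compute the average I would have to wrap another SELECT around this, with one column per sensor. That is fine with only two sensors, but with three or four it gets very messy, because several sensors may be NULL in the same row.
The SQL is generated by a Python application that uses Postgres 9.4. Is there a simpler way to achieve what I need? I feel I am going down a rather complicated path.
Edit 2: Based on your input I have produced the SQL below. Again, it looks fairly complex, but that is acceptable if it is reliable and maintainable. Your thoughts and review are welcome: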
SELECT ts, sensortotal, sensorcount,
CASE
WHEN sensorcount = 0 THEN -1000
ELSE sensortotal/sensorcount
END AS sensorAvg
FROM (
WITH grid as (
SELECT t5
FROM (SELECT generate_series(min(date_trunc('hour', ts)), max(ts), interval '5 min') as t5
FROM device_history_20865735
) d
WHERE t5 between '2015-05-13 09:00' and '2015-05-14 09:00'
)
SELECT d1.t5 || '.000000' as ts
, Coalesce(avg(d1.sensorvalue), 0) + Coalesce(avg(d2.sensorvalue),0) as sensorTotal
, (CASE
WHEN avg(d1.sensorvalue) ISNULL THEN 0
ELSE 1
END + CASE
WHEN avg(d2.sensorvalue) ISNULL THEN 0
ELSE 1
END) as sensorCount
FROM (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue
FROM grid LEFT JOIN
device_history_20865735 t
ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min'
GROUP BY grid.t5
) d1 LEFT JOIN
(SELECT grid.t5, avg(t.sensorvalue) as sensorvalue
FROM grid LEFT JOIN
device_history_493417852 t
ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min'
GROUP BY grid.t5
) d2 on d1.t5 = d2.t5
GROUP BY d1.t5
ORDER BY d1.t5
) tmp;
Thanks.

To get accurate averages, you need to compute each average separately before joining:
WITH grid as (
SELECT t5
FROM (SELECT generate_series(min(date_trunc('hour', ts)), max(ts), interval '5 min') as t5
FROM device_history_20865735
) d
WHERE t5 between '2015-05-13 09:00' and '2015-05-14 09:00'
)
SELECT d1.t5 || '.000000' as ts,
avg(d1.sensorvalue) as sensorvalue1
, avg(d2.sensorvalue) as sensorvalue2
FROM (SELECT grid.t5, avg(t.sensorvalue) as sensorvalue
FROM grid LEFT JOIN
device_history_20865735 t
ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min'
GROUP BY grid.t5
) d1 LEFT JOIN
(SELECT grid.t5, avg(t.sensorvalue) as sensorvalue
FROM grid LEFT JOIN
device_history_493417852 t
ON t.ts >= grid.t5 AND t.ts <grid.t5 + interval '5 min'
GROUP BY grid.t5
) d2 on d1.t5 = d2.t5
GROUP BY d1.t5
ORDER BY d1.t5;
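The aggregate-then-join pattern above can be tried on a tiny data set. The following is only a sketch: it uses Python's built-in `sqlite3` instead of Postgres, replaces `generate_series` with a hand-filled `grid` table, and stores timestamps as plain integer minutes. The table names `sensor_a` and `sensor_b` and all values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grid (t5 INTEGER);                        -- bucket start, in minutes
    CREATE TABLE sensor_a (ts INTEGER, sensorvalue REAL);  -- hypothetical sensor 1
    CREATE TABLE sensor_b (ts INTEGER, sensorvalue REAL);  -- hypothetical sensor 2
    INSERT INTO grid VALUES (0), (5), (10);
    INSERT INTO sensor_a VALUES (1, 20.0), (3, 22.0), (6, 0.0);
    INSERT INTO sensor_b VALUES (2, 30.0), (11, 10.0);
""")

# Each sensor is averaged per bucket in its own subquery, then the two
# per-bucket results are joined on the bucket start.
query = """
    SELECT d1.t5, d1.sensorvalue AS sensorvalue1, d2.sensorvalue AS sensorvalue2
    FROM (SELECT g.t5, avg(t.sensorvalue) AS sensorvalue
          FROM grid g LEFT JOIN sensor_a t
            ON t.ts >= g.t5 AND t.ts < g.t5 + 5
          GROUP BY g.t5) d1
    LEFT JOIN
         (SELECT g.t5, avg(t.sensorvalue) AS sensorvalue
          FROM grid g LEFT JOIN sensor_b t
            ON t.ts >= g.t5 AND t.ts < g.t5 + 5
          GROUP BY g.t5) d2
      ON d1.t5 = d2.t5
    ORDER BY d1.t5
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A bucket with no readings for a sensor comes back as NULL (`None` in Python), while a genuine reading of 0 stays 0.0, which matters here because 0 is a valid temperature.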
It sounds like you want to do something like this:
(coalesce(value1,0) + coalesce(value2,0) + coalesce(value3,0)) /
((value1 IS NOT NULL)::int + (value2 IS NOT NULL)::int + (value3 IS NOT NULL)::int)
AS average
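Basically, just do the math you want for each row. The only tricky part is counting the non-NULL values. I used a cast, but there are other options, such as: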
CASE WHEN value1 IS NULL THEN 0 ELSE 1 END
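As a rough check of the row-wise formula, here is a small sketch run through Python's built-in `sqlite3` rather than Postgres. The `readings` table and its values are invented for the example. The `NULLIF(..., 0)` guard is an addition not present in the answer: in Postgres a row where every value is NULL would otherwise raise a division-by-zero error, and with the guard it yields NULL instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (ts INTEGER, value1 REAL, value2 REAL, value3 REAL);
    INSERT INTO readings VALUES
        (0,  21.0, 30.0, NULL),   -- two sensors reported
        (5,   0.0, NULL, NULL),   -- one sensor reported a real 0
        (10, NULL, NULL, NULL);   -- nothing reported
""")

# Sum of the non-NULL values divided by the number of non-NULL values.
query = """
    SELECT ts,
           (coalesce(value1, 0) + coalesce(value2, 0) + coalesce(value3, 0))
           / NULLIF(  CASE WHEN value1 IS NULL THEN 0 ELSE 1 END
                    + CASE WHEN value2 IS NULL THEN 0 ELSE 1 END
                    + CASE WHEN value3 IS NULL THEN 0 ELSE 1 END, 0) AS average
    FROM readings
    ORDER BY ts
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
```

The CASE form is used for counting because it is portable; the `(value1 IS NOT NULL)::int` cast is Postgres syntax.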
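I don't know exactly how you need to compute the average, but you could do (coalesce(sum(t.sensorvalue), 0) + coalesce(sum(w.sensorvalue), 0)) / (count(t.sensorvalue) + count(w.sensorvalue)). This extends easily to any number of sensors.
Thanks @dnoeth! I need it computed once per grid row, e.g. once every 5 minutes, not over the whole column...
Thanks @Gordon! I get a syntax error though: ERROR: syntax error at or near line 21: GROUP BY d1.t5 ^
There is no relationship between d1 and d2 in the LEFT JOIN; the ON condition is missing.
I have managed to get it running, but the results are the same as the SQL at the top... As for the averaging question, is there a more "elegant" solution?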
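Since the SQL is generated from a Python application, one possible way to keep the expression manageable for three, four, or more sensors is to build it from a list of column names. This is a sketch only: `rowwise_avg_expr`, the `buckets` table, and the column names `s1` to `s4` are hypothetical, and the generated expression is exercised with `sqlite3` rather than Postgres.

```python
import re
import sqlite3

# Accept only simple identifiers, because the names are interpolated into SQL.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def rowwise_avg_expr(columns):
    """Build a SQL expression averaging the non-NULL values among `columns`."""
    if not columns:
        raise ValueError("need at least one column")
    for col in columns:
        if not _IDENT.fullmatch(col):
            raise ValueError(f"unsafe column name: {col!r}")
    total = " + ".join(f"coalesce({c}, 0)" for c in columns)
    count = " + ".join(f"CASE WHEN {c} IS NULL THEN 0 ELSE 1 END" for c in columns)
    # NULLIF turns an all-NULL row into NULL instead of a division by zero.
    return f"({total}) / NULLIF({count}, 0)"

expr = rowwise_avg_expr(["s1", "s2", "s3", "s4"])

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buckets (s1 REAL, s2 REAL, s3 REAL, s4 REAL)")
conn.execute("INSERT INTO buckets VALUES (10.0, NULL, 20.0, NULL)")
result = conn.execute(f"SELECT {expr} FROM buckets").fetchone()[0]
print(result)
```

In the real query each column would be one sensor's per-bucket `avg(...)`. If the averaged columns were integers, Postgres would perform integer division, so a cast to a numeric type might be needed.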