MySQL: calculating the average, variance and standard deviation of two numbers in two different rows/columns for specific dates, using SQL/PHP
I have a database with the following structure:
rowid ID startTimestamp endTimestamp subject
1 00:50:c2:63:10:1a ...1000 ...1090 entrance
2 00:50:c2:63:10:1a ...1100 ...1270 entrance
3 00:50:c2:63:10:1a ...1300 ...1310 door1
4 00:50:c2:63:10:1a ...1370 ...1400 entrance
.
.
.
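With this SQL query I can get the average difference between the endTimestamp of one row and the startTimestamp of the next row, per subject and ID, together with the minimum, maximum, variance and standard deviation of that difference: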
SELECT ID,AVG(diff) AS average,
AVG(diff*diff) - AVG(diff)*AVG(diff) AS variance,
SQRT(AVG(diff*diff) - AVG(diff)*AVG(diff)) AS stdev,
MIN(diff) AS minTime,
MAX(diff) AS maxTime
FROM
(SELECT t1.id, t1.endTimestamp,
min(t2.startTimeStamp) - t1.endTimestamp AS diff
FROM table1 t1
INNER JOIN table1 t2
ON t2.ID = t1.ID AND t2.subject = t1.subject
AND t2.startTimestamp > t1.startTimestamp -- consider only later startTimestamps
WHERE t1.subject = 'entrance'
GROUP BY t1.id, t1.endTimestamp) AS diffs
GROUP BY ID
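The expression AVG(diff*diff) - AVG(diff)*AVG(diff) is the population variance, E[X²] − (E[X])². As far as I know, MySQL's VAR_POP() and STDDEV_POP() return the same population quantities. Below is a minimal sketch that runs the query above against an in-memory SQLite database and compares it with Python's `statistics` module. It assumes the four sample rows shown earlier, uses only the visible trailing digits of the timestamps (the elided prefix is left out), and registers a Python `SQRT` because not every SQLite build ships one.

```python
import math
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
# SQRT is not available in every SQLite build, so register a Python version.
conn.create_function("SQRT", 1, math.sqrt)

conn.execute(
    "CREATE TABLE table1 (ID TEXT, startTimestamp INTEGER, "
    "endTimestamp INTEGER, subject TEXT)"
)
device = "00:50:c2:63:10:1a"
# Assumption: only the visible trailing digits of the sample timestamps are used.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (device, 1000, 1090, "entrance"),
        (device, 1100, 1270, "entrance"),
        (device, 1300, 1310, "door1"),
        (device, 1370, 1400, "entrance"),
    ],
)

query = """
SELECT ID, AVG(diff) AS average,
       AVG(diff*diff) - AVG(diff)*AVG(diff) AS variance,
       SQRT(AVG(diff*diff) - AVG(diff)*AVG(diff)) AS stdev,
       MIN(diff) AS minTime,
       MAX(diff) AS maxTime
FROM (SELECT t1.ID, t1.endTimestamp,
             MIN(t2.startTimestamp) - t1.endTimestamp AS diff
      FROM table1 t1
      INNER JOIN table1 t2
        ON t2.ID = t1.ID AND t2.subject = t1.subject
       AND t2.startTimestamp > t1.startTimestamp
      WHERE t1.subject = 'entrance'
      GROUP BY t1.ID, t1.endTimestamp) AS diffs
GROUP BY ID
"""
row = conn.execute(query).fetchone()

# The two gaps between consecutive 'entrance' rows, computed by hand.
gaps = [1100 - 1090, 1370 - 1270]
print(row)
print(statistics.mean(gaps), statistics.pvariance(gaps), statistics.pstdev(gaps))
```

The last 'entrance' row has no later row to pair with, so the inner join drops it and only two gaps enter the statistics.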
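This works well as long as I only have a few rows on the same day with small time differences, as you can see in this sqlfiddle:
However, as soon as I have additional data from a different day, I get bad values:
So I would like to calculate the average, minimum, maximum, variance and standard deviation for each day.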
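I know MySQL has date functions, but I can't get it to work... Can anyone help me? Or do I have to write a piece of PHP code to handle this?

Is it as simple as adding the date to the GROUP BY? Here is syntax that should work in both MySQL and SQLite, based on the end time and assuming the end time is stored as a datetime: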
SELECT ID, thedate, AVG(diff) AS average,
AVG(diff*diff) - AVG(diff)*AVG(diff) AS variance,
SQRT(AVG(diff*diff) - AVG(diff)*AVG(diff)) AS stdev,
MIN(diff) AS minTime,
MAX(diff) AS maxTime
FROM (SELECT t1.id, t1.endTimestamp, DATE(t1.endTimestamp) as thedate, -- qualified with t1: the column exists in both t1 and t2
min(t2.startTimeStamp) - t1.endTimestamp AS diff
FROM table1 t1 INNER JOIN
table1 t2
ON t2.ID = t1.ID AND t2.subject = t1.subject AND
t2.startTimestamp > t1.startTimestamp -- consider only later startTimestamps
WHERE t1.subject = 'entrance'
GROUP BY t1.id, t1.endTimestamp
) AS diffs
GROUP BY ID, thedate
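For the asker's preferred SQLite, the per-day grouping could look roughly like the sketch below. It assumes the timestamps are stored as Unix epoch seconds, in which case SQLite's DATE(x, 'unixepoch') plays the role of a date conversion. The epoch values, the two-day sample data and the registered `SQRT` helper are all hypothetical and only serve to make the example runnable.

```python
import math
import sqlite3

DAY1 = 1388534400      # hypothetical: 2014-01-01 00:00:00 UTC
DAY2 = DAY1 + 86400    # the following day

conn = sqlite3.connect(":memory:")
# SQRT is not in every SQLite build; clamp at 0 to absorb rounding noise.
conn.create_function("SQRT", 1, lambda v: math.sqrt(max(v, 0.0)))

conn.execute(
    "CREATE TABLE table1 (ID TEXT, startTimestamp INTEGER, "
    "endTimestamp INTEGER, subject TEXT)"
)
device = "00:50:c2:63:10:1a"
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (device, DAY1 + 1000, DAY1 + 1090, "entrance"),
        (device, DAY1 + 1100, DAY1 + 1270, "entrance"),
        (device, DAY1 + 1370, DAY1 + 1400, "entrance"),
        (device, DAY2 + 1000, DAY2 + 1020, "entrance"),
        (device, DAY2 + 1050, DAY2 + 1100, "entrance"),
    ],
)

query = """
SELECT ID, thedate, AVG(diff) AS average,
       AVG(diff*diff) - AVG(diff)*AVG(diff) AS variance,
       SQRT(AVG(diff*diff) - AVG(diff)*AVG(diff)) AS stdev,
       MIN(diff) AS minTime,
       MAX(diff) AS maxTime
FROM (SELECT t1.ID, t1.endTimestamp,
             DATE(t1.endTimestamp, 'unixepoch') AS thedate,
             MIN(t2.startTimestamp) - t1.endTimestamp AS diff
      FROM table1 t1
      INNER JOIN table1 t2
        ON t2.ID = t1.ID AND t2.subject = t1.subject
       AND t2.startTimestamp > t1.startTimestamp
      WHERE t1.subject = 'entrance'
      GROUP BY t1.ID, t1.endTimestamp) AS diffs
GROUP BY ID, thedate
ORDER BY thedate
"""
per_day = conn.execute(query).fetchall()
for r in per_day:
    print(r)
```

Each output row is one (ID, day) pair. Note that in this sample the last row of the first day is still paired with the first row of the second day, so one overnight gap ends up in the first day's statistics.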
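If it is stored as a timestamp, see Marty's comment.

Given that MySQL has functions for standard deviation and variance, is there any reason not to use them?

Is it MySQL or SQLite, or both?

I would prefer a solution in SQLite... but whenever I try to switch, SQL Fiddle returns an error...

It runs per date if you use DATE(FROM_UNIXTIME(endtimestamp)).

@MartyMcVry. Thanks for the clarification.

Hi, I made another fiddle: it looks pretty good... I will check the values in the evening and write again...

@MichaelMeier. You probably want to filter out anything that starts on one day and ends on another. If you know a plausibility threshold, just add a where clause such as where diff<100000 or whatever.

@MichaelMeier. I don't know what you mean by "amount of differences". Three possible notions are sum(diff), sum(abs(diff)) and count(distinct diff).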
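The threshold filter suggested in the comments can be sketched as follows, again in SQLite driven from Python. The `MAX_GAP` value of 3600 seconds, the epoch values and the sample rows are assumptions chosen for illustration; the comments only mention a threshold such as 100000 "or whatever". The query also returns the three "amount" candidates named above.

```python
import sqlite3

DAY1 = 1388534400      # hypothetical: 2014-01-01 00:00:00 UTC
DAY2 = DAY1 + 86400
MAX_GAP = 3600         # hypothetical plausibility threshold in seconds

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (ID TEXT, startTimestamp INTEGER, "
    "endTimestamp INTEGER, subject TEXT)"
)
device = "00:50:c2:63:10:1a"
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (device, DAY1 + 1000, DAY1 + 1090, "entrance"),
        (device, DAY1 + 1100, DAY1 + 1270, "entrance"),
        (device, DAY1 + 1370, DAY1 + 1400, "entrance"),
        (device, DAY2 + 1000, DAY2 + 1020, "entrance"),
        (device, DAY2 + 1050, DAY2 + 1100, "entrance"),
    ],
)

query = """
SELECT ID, thedate,
       AVG(diff)            AS average,
       SUM(diff)            AS total,
       SUM(ABS(diff))       AS totalAbs,
       COUNT(DISTINCT diff) AS distinctGaps
FROM (SELECT t1.ID, t1.endTimestamp,
             DATE(t1.endTimestamp, 'unixepoch') AS thedate,
             MIN(t2.startTimestamp) - t1.endTimestamp AS diff
      FROM table1 t1
      INNER JOIN table1 t2
        ON t2.ID = t1.ID AND t2.subject = t1.subject
       AND t2.startTimestamp > t1.startTimestamp
      WHERE t1.subject = 'entrance'
      GROUP BY t1.ID, t1.endTimestamp) AS diffs
WHERE diff < ?           -- drop implausibly long gaps, e.g. overnight ones
GROUP BY ID, thedate
ORDER BY thedate
"""
filtered = conn.execute(query, (MAX_GAP,)).fetchall()
for r in filtered:
    print(r)
```

Filtering in the outer query, before the final GROUP BY, removes the overnight gap from the first day without touching the pairing logic of the inner query.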