Finding the median as of each date in a table in SQL Server
I use the query below to find the median for each sector:
SELECT DISTINCT Sector,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Value)
           OVER (PARTITION BY Sector) AS Median
FROM TABLE
The table has the following format:
Sector Date Value
A 2014-08-01 1
B 2014-08-01 5
C 2014-08-01 7
A 2014-08-02 6
B 2014-08-02 5
C 2014-08-02 4
A 2014-08-03 3
B 2014-08-03 9
C 2014-08-03 6
A 2014-08-04 5
B 2014-08-04 8
C 2014-08-04 9
A 2014-08-05 5
B 2014-08-05 7
C 2014-08-05 2
So I get the expected result, as follows:
Sector Median
A 5
B 7
C 6
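To reproduce the result above, a minimal setup sketch follows. The table name Table1 is borrowed from the second answer below; the column types are assumptions, since the question does not state them.

```sql
-- Assumed schema: the column types are guesses, not taken from the question.
CREATE TABLE Table1 (
    Sector  char(1) NOT NULL,
    [Date]  date    NOT NULL,
    [Value] int     NOT NULL
);

-- The 15 sample rows listed in the question.
INSERT INTO Table1 (Sector, [Date], [Value]) VALUES
    ('A', '2014-08-01', 1), ('B', '2014-08-01', 5), ('C', '2014-08-01', 7),
    ('A', '2014-08-02', 6), ('B', '2014-08-02', 5), ('C', '2014-08-02', 4),
    ('A', '2014-08-03', 3), ('B', '2014-08-03', 9), ('C', '2014-08-03', 6),
    ('A', '2014-08-04', 5), ('B', '2014-08-04', 8), ('C', '2014-08-04', 9),
    ('A', '2014-08-05', 5), ('B', '2014-08-05', 7), ('C', '2014-08-05', 2);

-- Per-sector median over all dates (requires SQL Server 2012 or later).
SELECT DISTINCT Sector,
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY [Value])
           OVER (PARTITION BY Sector) AS Median
FROM Table1
ORDER BY Sector;
```

With the sample rows this should return the three medians shown above (A 5, B 7, C 6).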
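Now I need to change the process so that the median is calculated using only the records up to and including the given date. The new result would therefore be: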
Sector Date Median
A 2014-08-01 1
B 2014-08-01 5
C 2014-08-01 7 (Only 1 record each was considered for A, B and C)
A 2014-08-02 3.5
B 2014-08-02 5
C 2014-08-02 5.5 (2 records each were considered for A, B and C)
A 2014-08-03 3
B 2014-08-03 5
C 2014-08-03 6 (3 records each were considered for A, B and C)
A 2014-08-04 4
B 2014-08-04 6.5
C 2014-08-04 6.5 (4 records each were considered for A, B and C)
A 2014-08-05 5
B 2014-08-05 7
C 2014-08-05 6 (All 5 records each were considered for A, B and C)
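So this would be a kind of cumulative median. Can someone tell me how to do this? My table has about 2.3 million records, with about 1,100 records each across about 1,100 dates.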
Please let me know if you need any further information.
This makes things harder, because the following does not work:
SELECT DISTINCT Sector, Date,
PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Value) OVER (PARTITION BY sector ORDER BY DATE) AS Median
FROM TABLE;
Alas. Instead, you can use cross apply:
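-- For each row, compute the median over that sector's rows up to the row's date.
-- PERCENTILE_DISC does not accept ORDER BY inside OVER, so only PARTITION BY is used;
-- the date restriction comes from the WHERE clause of the correlated subquery.
select t.sector, t.date, t.value, m.median
from table t cross apply
     (select top 1 PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY t2.Value) OVER (PARTITION BY t2.sector) AS Median
      from table t2
      where t2.sector = t.sector and t2.date <= t.date
     ) m;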
Another approach is to create a triangular join that collects all past values for each day, and to use that as the data:
;With T AS (
SELECT t2.Sector, t2.[Date], t1.[Value]
FROM Table1 t1
LEFT JOIN Table1 t2 ON t1.Sector = t2.Sector and t1.[Date] <= t2.[Date]
)
SELECT DISTINCT Sector
, [Date]
, PERCENTILE_CONT(0.5)
WITHIN GROUP (ORDER BY [Value])
OVER (PARTITION BY sector, [Date]) AS Median
FROM T
ORDER BY [Date], Sector;
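In the query I changed PERCENTILE_DISC to PERCENTILE_CONT so that the median is correct when there is an even number of values, as on the second day.
Dear sir, thank you. This seems to work well on my test data set. I am now running it against the big table. Hopefully it goes well. Thank you very much for your help.
Dear sir, thank you for your answer. I have changed my PERCENTILE_DISC to PERCENTILE_CONT.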
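The difference between the two functions for an even number of values can be checked in isolation. The sketch below is an illustration only: it uses an inline VALUES list (with a derived-table alias v, chosen here) holding sector A's two values up to 2014-08-02, rather than the real table.

```sql
-- Sector A's values up to 2014-08-02 are 1 and 6, an even count.
-- PERCENTILE_DISC returns an actual value from the set;
-- PERCENTILE_CONT interpolates between the two middle values.
SELECT DISTINCT
       PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY v.[Value]) OVER () AS MedianDisc,
       PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY v.[Value]) OVER () AS MedianCont
FROM (VALUES (1), (6)) AS v([Value]);
```

MedianCont should come out as 3.5, matching the expected result in the question, while MedianDisc should return one of the existing values.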