SQL: Is there a better way to determine the time interval boundaries of a sequence pattern?
I have a payments table with positive and negative values, i.e. captures and credits. I need to identify the points at which we have received a net positive amount since the last net positive amount. For example, if a customer makes these payments and receives these credits:
01/01 $100 <-
02/01 -$100
03/01 -$100
04/01 $100
05/01 $100
06/01 $100 <-
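Then I discard the records whose NetCaptures is $0 or less, recalculate the start dates, recalculate NetCaptures, and repeat until there are no more records to delete, which leaves this: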
Start End NetCaptures
1900/01/01 2011/01/01 $100
2011/01/02 2011/06/01 $100
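Is there a better way? Some clever use of analytic expressions? This is getting close to RBAR. In practice the speed is acceptable: it runs in 10 minutes for 500K records, versus 1.5 minutes before I started accounting for credits this way.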
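*Results*
Although Microsoft does not support an elegant running total feature, based on that idea I ended up with code like this: find all captures, calculate the running total at each capture, and discard any capture that is preceded by a record with an equal or greater running total.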
CREATE TABLE #Sequences
(
    OrderID INT NOT NULL,
    Sequence INT NOT NULL,
    PRIMARY KEY (OrderID, Sequence),
    StartDate DATE NOT NULL DEFAULT '1900-01-01',
    EndDate DATE NOT NULL,
    /* Running total of all receipts up to EndDate; updated below */
    CumulativeReceipts DECIMAL(18, 2) NOT NULL DEFAULT 0.00,
    CapturesThisPeriod DECIMAL(18, 2) NOT NULL DEFAULT 0.00
);

/* Every capture (positive receipt) is a candidate period end */
INSERT INTO #Sequences (OrderID, Sequence, EndDate)
SELECT OrderID, ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY DateReceived), DateReceived
FROM Receipts
WHERE Amount > 0.00;

/* Calculate the start date for each period */
UPDATE S
SET StartDate = DATEADD(D, 1, Prev.EndDate)
FROM
    #Sequences AS S
    INNER JOIN #Sequences AS Prev ON S.OrderID = Prev.OrderID AND Prev.Sequence = S.Sequence - 1;

/* Calculate the cumulative total for each period */
UPDATE M
SET CumulativeReceipts = R.Receipts
FROM
    #Sequences AS M
    INNER JOIN
    (
        SELECT
            M.OrderID, M.Sequence, SUM(R.Amount) AS Receipts
        FROM
            #Sequences AS M
            INNER JOIN Receipts AS R ON M.OrderID = R.OrderID AND R.DateReceived <= M.EndDate
        GROUP BY
            M.OrderID, M.Sequence
    ) AS R ON M.OrderID = R.OrderID AND M.Sequence = R.Sequence;

/* Delete sequences which do not represent net positive receipts */
DELETE FROM M
FROM #Sequences AS M
WHERE EXISTS
(
    SELECT *
    FROM #Sequences AS Prev
    WHERE M.OrderID = Prev.OrderID
      AND Prev.Sequence < M.Sequence
      AND Prev.CumulativeReceipts >= M.CumulativeReceipts
);

/* Recalculate sequence numbers and dates */
UPDATE S
SET Sequence = NewSequence
FROM
(
    SELECT Sequence, ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY Sequence) AS NewSequence
    FROM #Sequences
) AS S;

UPDATE S
SET StartDate = DATEADD(D, 1, Prev.EndDate)
FROM
    #Sequences AS S
    INNER JOIN #Sequences AS Prev ON S.OrderID = Prev.OrderID AND Prev.Sequence = S.Sequence - 1;

/* Calculate net captures per period, and continue with analysis */
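For readers on a version that has windowed running aggregates (SQL Server 2012 and later accept ORDER BY and a ROWS frame inside SUM() OVER), the same "keep only captures that push the running total above every earlier running total" idea can be sketched in a single statement. This is only an illustration under stated assumptions, not the code used above: the @Receipts table variable, the OrderID value 1, and the CTE names Running, Peaks, and Boundaries are made up for the example; each order is assumed to have at most one receipt per date; and a period is only opened once the running total rises above zero.

```sql
-- Hypothetical sample data mirroring the example in the question (a single order).
DECLARE @Receipts TABLE
(
    OrderID      INT            NOT NULL,
    DateReceived DATE           NOT NULL,
    Amount       DECIMAL(18, 2) NOT NULL
);

INSERT INTO @Receipts (OrderID, DateReceived, Amount)
VALUES
    (1, '2011-01-01',  100.00),
    (1, '2011-02-01', -100.00),
    (1, '2011-03-01', -100.00),
    (1, '2011-04-01',  100.00),
    (1, '2011-05-01',  100.00),
    (1, '2011-06-01',  100.00);

WITH Running AS
(
    -- Running total of captures and credits per order, in date order.
    SELECT OrderID, DateReceived,
           SUM(Amount) OVER (PARTITION BY OrderID ORDER BY DateReceived
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CumulativeReceipts
    FROM @Receipts
),
Peaks AS
(
    -- Highest running total reached strictly before the current row (NULL on the first row).
    SELECT OrderID, DateReceived, CumulativeReceipts,
           MAX(CumulativeReceipts) OVER (PARTITION BY OrderID ORDER BY DateReceived
                                         ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PriorMax
    FROM Running
),
Boundaries AS
(
    -- Keep only rows that set a new high. The WHERE filter is applied before LAG,
    -- so LAG looks back at the previous surviving boundary, not the previous receipt.
    SELECT OrderID, DateReceived AS EndDate, CumulativeReceipts,
           LAG(DateReceived) OVER (PARTITION BY OrderID ORDER BY DateReceived) AS PrevEndDate,
           LAG(CumulativeReceipts, 1, 0.00) OVER (PARTITION BY OrderID ORDER BY DateReceived) AS PrevCumulative
    FROM Peaks
    WHERE CumulativeReceipts > COALESCE(PriorMax, 0.00)
)
SELECT OrderID,
       COALESCE(DATEADD(DAY, 1, PrevEndDate), CAST('19000101' AS DATE)) AS StartDate,
       EndDate,
       CumulativeReceipts - PrevCumulative AS NetCaptures
FROM Boundaries
ORDER BY OrderID, EndDate;
```

On the sample rows the running totals are 100, 0, -100, 0, 100, 200, so only 2011-01-01 and 2011-06-01 set a new high. The query returns the two periods from the question: 1900-01-01 to 2011-01-01 and 2011-01-02 to 2011-06-01, each with NetCaptures of 100.00. Unlike the temp-table version, which always keeps an order's first capture, this sketch drops a first capture whose running total is still zero or negative.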
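Search for running sums, for example.
Are you saying you want the local peaks of the curve? When the curve rises above its previous maximum? Or are you interested in when the curve rises above zero?
@Andomar, I believe that is exactly what I need! If you want to add it as an answer, I will mark it as accepted. I knew there had to be a simpler way. I will post the code once it is implemented.
@Lasse, in terms of the running total, I am looking for all the points above zero. I was stuck on working with the individual amounts; it had not occurred to me to start from the cumulative total and work backward from there. If Microsoft implemented SUM OVER ORDER BY, this would be easy.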