Eliminating duplicate rows with MAX() and ROW_NUMBER() in SQL Server 2008 R2
I have data like the following:
B_Pt_No | MRN | B_Adm_Date | B_Dsch_Date | B_Days_Stay | B_Days_To_Readmit
123 | 1234 | 4/3/2013 | 4/10/2013 | 7 | 30
123 | 1234 | 4/3/2013 | 4/10/2013 | 7 | 30
125 | 1229 | 4/9/2013 | 4/22/2013 | 13 | 23
I am using the following query to get the data:
-- VARIABLE INITIALIZATION AND DECLARATION
DECLARE @SD AS DATE;
DECLARE @ED AS DATE;
SET @SD = '2011-01-01';
SET @ED = '2013-12-31';
-- COLUMN SELECTION
SELECT DISTINCT B_Pt_No AS [READMIT ENCOUNTER]
, B_Med_Rec_No AS [MRN]
, B_Adm_Src_Desc AS [READMIT SOURCE]
, CAST(B_Adm_Date AS DATE) AS [READMIT DATE]
, CAST(B_Dsch_Date AS DATE) AS [READMIT DISC DATE]
, DATEPART(MONTH, B_Dsch_Date) AS [READMIT MONTH]
, DATEPART(YEAR, B_Dsch_Date) AS [READMIT YEAR]
, B_Days_Stay AS [LOS]
, MAX(B_Days_To_Readmit) OVER (PARTITION BY B_PT_NO) AS [INTERIM] -- <-- trouble
, CASE
WHEN B_Pyr1_Co_Plan_Cd = '*'
THEN 'SELF PAY'
ELSE B_Pyr1_Co_Plan_Cd
END AS [READMIT INSURANCE]
, B_Mdc_Name AS [READMIT MDC]
, B_Drg_No AS [READMIT DRG]
, B_Clasf_Desc AS [READMIT DX CLASF]
, B_Readm_Adm_Dr_Name AS [READMIT ADMITTING DR]
, B_Readm_Atn_Dr_Name AS [READMIT ATTENDING DR]
, B_Hosp_Svc AS [READMIT HOSP SVC]
-- DB USED
FROM smsdss.c_readmissions_v
-- FILTER(S) USED
/*
THE FIRST FILTER IS STATING THAT WE ONLY WANT MRN'S THAT HAVE HAD A
DRG FROM A CERTAIN GROUP.
*/
WHERE B_Med_Rec_No IN (
SELECT DISTINCT MED_REC_NO
FROM smsdss.BMH_PLM_PtAcct_V
WHERE Plm_Pt_Acct_Type = 'I'
AND PtNo_Num < '20000000'
AND Dsch_Date BETWEEN @SD AND @ED
AND drg_no IN ( -- DRG'S OF INTEREST
'190','191','192' -- COPD
,'291','292','293' -- CHF
,'193','194','195' -- PN
)
)
AND B_Dsch_Date BETWEEN @SD AND @ED
AND B_Adm_Src_Desc != 'Scheduled Admission'
AND B_Pt_No < '20000000'
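So far the query works well: there are only two duplicate rows, the same ones shown in the sample data. I use DISTINCT on the visit ID and MAX(B_Days_To_Readmit) OVER (PARTITION BY B_PT_NO) to get the maximum number of days between visits. The problem is that the INTERIM column has the same value in both rows, so DISTINCT does not collapse them. Is it possible to use some combination of MAX and ROW_NUMBER to get just one of them?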
Thanks.

Are you just trying to eliminate the duplicate and take the latest readmit date for each visit? Can you add some consistency here? The sample data shows a visit ID and a readmit date, but the query references B_PT_NO and B_Days_To_READMIT. Can you also punch the person who named those columns? Finally, please don't use 'single quotes' to delimit aliases: this syntax is deprecated in some forms, and it also makes your aliases look like string values. Use [square brackets] when necessary, or don't use aliases that need to be escaped.

Yes, I am trying to eliminate the duplicates and get the latest readmit date for each visit ID, and I will fix the consistency. I didn't know single quotes were deprecated, so I will stick with brackets.

@AaronBertrand +1 for the punch of said person.

Thanks for the help, Aaron.
;WITH cte AS
(
SELECT B_Pt_No AS [READMIT ENCOUNTER]
, B_Med_Rec_No AS MRN
, B_Adm_Src_Desc AS [READMIT SOURCE]
, CAST(B_Adm_Date AS DATE) AS [READMIT DATE]
, CAST(B_Dsch_Date AS DATE) AS [READMIT DISC DATE]
, DATEPART(MONTH, B_Dsch_Date) AS [READMIT MONTH]
, DATEPART(YEAR, B_Dsch_Date) AS [READMIT YEAR]
, B_Days_Stay AS LOS
, B_Days_To_Readmit AS INTERIM
, CASE WHEN B_Pyr1_Co_Plan_Cd = '*' THEN 'SELF PAY'
ELSE B_Pyr1_Co_Plan_Cd END AS [READMIT INSURANCE]
, B_Mdc_Name AS [READMIT MDC]
, B_Drg_No AS [READMIT DRG]
, B_Clasf_Desc AS [READMIT DX CLASF]
, B_Readm_Adm_Dr_Name AS [READMIT ADMITTING DR]
, B_Readm_Atn_Dr_Name AS [READMIT ATTENDING DR]
, B_Hosp_Svc AS [READMIT HOSP SVC]
, rn = ROW_NUMBER() OVER (PARTITION BY B_Pt_No ORDER BY B_Adm_Date DESC)
FROM smsdss.c_readmissions_v AS r
WHERE EXISTS
(
SELECT 1 FROM smsdss.BMH_PLM_PtAcct_V
WHERE Plm_Pt_Acct_Type = 'I'
AND PtNo_Num < '20000000'
AND Dsch_Date BETWEEN @SD AND @ED
AND drg_no IN ('190','191','192' -- COPD
,'291','292','293' -- CHF
,'193','194','195' -- PN
) AND MED_REC_NO = r.B_Med_Rec_No
)
AND B_Dsch_Date BETWEEN @SD AND @ED
AND B_Adm_Src_Desc != 'Scheduled Admission'
AND B_Pt_No < '20000000'
)
SELECT * FROM cte WHERE rn = 1;
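
To see why the windowed MAX() alone leaves the duplicates in place while ROW_NUMBER() removes them, here is a minimal self-contained sketch. It is only an illustration under assumptions: the table variable @readmits is a hypothetical stand-in for smsdss.c_readmissions_v, the column types are guesses, and the B_Hosp_Svc values 'MED' and 'SUR' are invented to represent whichever column actually differs between the two rows for encounter 123.

```sql
-- Hypothetical stand-in for smsdss.c_readmissions_v (types are assumptions).
DECLARE @readmits TABLE
(
    B_Pt_No           VARCHAR(12),
    B_Med_Rec_No      VARCHAR(12),
    B_Adm_Date        DATE,
    B_Dsch_Date       DATE,
    B_Days_Stay       INT,
    B_Days_To_Readmit INT,
    B_Hosp_Svc        VARCHAR(10)  -- assumed to differ between the duplicate rows
);

-- Sample rows from the question; the B_Hosp_Svc values are invented.
INSERT INTO @readmits VALUES
  ('123', '1234', '20130403', '20130410',  7, 30, 'MED'),
  ('123', '1234', '20130403', '20130410',  7, 30, 'SUR'),
  ('125', '1229', '20130409', '20130422', 13, 23, 'MED');

-- A window function never removes rows, and DISTINCT only collapses rows
-- that match in every selected column, so encounter 123 still appears twice.
SELECT DISTINCT B_Pt_No, B_Hosp_Svc
     , MAX(B_Days_To_Readmit) OVER (PARTITION BY B_Pt_No) AS INTERIM
FROM @readmits;

-- ROW_NUMBER() numbers the rows within each encounter; filtering on rn = 1
-- keeps exactly one row per B_Pt_No. B_Hosp_Svc is added to the ORDER BY
-- only as a tie-breaker so the surviving row is deterministic.
;WITH cte AS
(
    SELECT B_Pt_No, B_Hosp_Svc, B_Days_To_Readmit AS INTERIM
         , rn = ROW_NUMBER() OVER (PARTITION BY B_Pt_No
                                   ORDER BY B_Days_To_Readmit DESC, B_Hosp_Svc)
    FROM @readmits
)
SELECT B_Pt_No, B_Hosp_Svc, INTERIM FROM cte WHERE rn = 1;
```

The query above orders by B_Adm_Date DESC only. When two rows for the same B_Pt_No share the same admit date, as in the sample data, which of them receives rn = 1 is arbitrary, so adding a tie-breaker column to the ORDER BY may be worthwhile if it matters which row is kept.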