Combining rows with overlapping date ranges in SAS

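I am new to SAS and need some help understanding how to combine overlapping date ranges into one row. I want to merge overlapping date ranges when they have a matching ID. If the dates do not overlap, I want to leave the rows unchanged. If they overlap and match on both ID and drug code, they should be combined into one row. Please see my sample data set and expected results below: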

            Current Data set:           
            ID  Drug Code   BEG_Date    End_Date
            1   100 1/1/2018    1/1/2019
            1   100 1/1/2018    3/1/2018
            1   100 2/1/2018    04/30/2018
            1   90  4/1/2018    04/30/2018
            1   100 5/1/2018    6/1/2018
            1   98  6/1/2018    8/31/2018
            1   100 9/1/2018    5/4/2019

            Expected results:           
            ID  Drug Code   BEG_Date    End_Date
            1   100 1/1/2018    3/31/2018
            1   90  4/1/2018    04/30/2018
            1   100 5/1/2018    6/1/2018
            1   98  6/2/2018    8/31/2018
            1   100 9/1/2018    5/4/2019

            I wrote some SAS code, but it combines the dates even when there is no overlap. I would like code that works correctly in SAS.

            PROC SORT DATA=Want OUT=ONE;            
                BY PERSON_ID BEG_DATE DRUG_CODE END_DATE;       
            RUN;            
            data TWO (DROP=PERSON_ID2 DRUG_CODE2 BEG_DATE END_DATE          
                RENAME=(BEG2=BEG_DOS        
                END2=END_DOS));     
                SET ONE;        
                RETAIN BEG2 END2;       
                PERSON_ID2=LAG1(PERSON_ID);     
                DRUG_CODE2=LAG1(DRUG_CODE);     

                IF PERSON_ID2=PERSON_ID AND DRUG_CODE2=DRUG_CODE AND BEG_DATE LE(END2+1) THEN       
                    DO; 
                        BEG2=MIN(BEG_DATE,BEG2);
                        END2=MAX(END_DATE,END2);
                    END;    
                ELSE        
                    DO; 
                        SEG+1;
                        BEG2=BEG_DATE;
                        END2=END_DATE;
                    END;    

                FORMAT BEG2 END2 MMDDYY10.;     
            RUN;            

            DATA THREE(DROP=BEG_DOS END_DOS SEG);           
                RETAIN BEG_DATE END_DATE;       
                SET TWO;        
                BY PERSON_ID SEG;       
                FORMAT BEG_DATE END_DATE MMDDYY10.;     

                IF FIRST.SEG THEN       
                    DO; 
                        BEG_DATE=BEG_DOS;
                    END;    

                IF LAST.SEG THEN        
                    DO; 
                        END_DATE = END_DOS;
                        OUTPUT;
                    END;    
            RUN;    

I would do it like this. Create an obs for each ID and date. Flag the gaps and summarize by run.

data have;
   input ID  Drug_Code   (BEG End)(:mmddyy.);
   format BEG End mmddyyd10.;
   cards;
1   100 1/1/2018    3/1/2018
1   100 2/1/2018    04/30/2018
1   90  4/1/2018    04/30/2018
1   90  6/1/2018    8/15/2018
1   100 5/1/2018    6/1/2018
1   98  6/1/2018    8/31/2018
1   100 9/1/2018    5/4/2019
;;;;
   run;
proc print;
   run;
/*1   100 1/1/2018    1/1/2019*/

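/* Expand each range into one observation per day */
data exv / view=exv;
   set have;
   do date = beg to end;
      output;
      end;
   drop beg end;
   format date mmddyyd10.;
   run;
/* Sort the days and drop duplicates within ID and drug */
proc sort data=exv out=ex nodupkey;
   by id drug_code date;
   run;
/* Start a new run whenever consecutive days are not exactly one day apart */
data breaksV / view=BreaksV;
   set ex;
   by id drug_code;
   dif = dif(date);
   if first.drug_code then do;  dif=1; run=1; end;
   if dif ne 1 then run+1;
   run;
/* Collapse each run to its first and last day */
proc summary data=breaksV nway missing;
   class id drug_code run;
   var date;
   output out=want(drop=_type_) min=Begin max=End;
   run;
Proc print;
   run;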

Computing an extent composed of overlapping segment ranges requires a good understanding of the range conditions (cases).

Consider the scenarios that arise when the rows are sorted by start date (within any larger grouping set, such as id and drug). Let:

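  • [ and ] are the endpoints of a range
  • # marks the days within a range
  • Extent is the combined range that grows
  • Segment is the range in the current row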
Case 1: Growth. The Segment starts before the Extent ends.

The Segment either does not affect the Extent or extends it.

   [####]       Extent
+    [#]        Segment range DOES NOT contribute
 --------
   [####]       Extent (do not output a row, still growing)
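
The diagram above shows only the branch of Case 1 in which the Segment leaves the Extent unchanged. As a minimal sketch of both branches, the step below folds three rows into one Extent. The data set name `case1` and its three rows are made up for illustration and are not part of the question's data.

```sas
/* Hypothetical data: row 2 lies inside the Extent, row 3 reaches past its end */
data case1;
  input beg_date :mmddyy10. end_date :mmddyy10.;
  format beg_date end_date mmddyy10.;
  datalines;
1/1/2018 4/30/2018
2/1/2018 3/1/2018
3/15/2018 6/1/2018
;
run;

data _null_;
  set case1;
  retain ext_beg ext_end;
  format ext_beg ext_end mmddyy10.;

  if _n_ = 1 then do;
    /* The first row seeds the Extent */
    ext_beg = beg_date;
    ext_end = end_date;
  end;
  else if beg_date <= ext_end then
    /* Case 1: max() keeps ext_end for a contained Segment, moves it for a longer one */
    ext_end = max(ext_end, end_date);

  /* Write the running Extent to the log after each row */
  put beg_date= end_date= ext_beg= ext_end=;
run;
```

Under these assumptions the second row should leave `ext_end` at 04/30/2018 and the third row should push it to 06/01/2018, because `max` only ever moves the end point forward.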

Case 2: Terminus. Three possibilities:

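  • The Extent ends before the Segment starts
  • The next group is reached (a different id/drug combination)
  • The end of the data is reached

The second and third possibilities can be tested by checking the appropriate last. flag.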

       [####]       Extent
    +        ..[#]  Segment beyond Extent (gap is 2)
     --------
       [####]       output Extent
               [#]  reset Extent to Segment
    
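You can adjust the rule so that Segments that are adjacent (gap = 0) or close enough (gap < threshold) either extend the Extent or cause the Extent to be output and reset to the Segment.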
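
To make the effect of that rule concrete, here is a small sketch that runs the same merge with three tolerances. The macro name `merge_extents`, its parameters, and the `adjacent` data set are hypothetical; `tolerance` plays the role of `gap_allowed` in the answer's example code. Note that the tolerance is counted from the Extent's end date, so a Segment that starts the very next day (no uncovered days in between, gap = 0 in the diagram's terms) needs `tolerance=1`.

```sas
/* Hypothetical data: rows 1 and 2 are adjacent, row 3 starts 10 days after row 2 ends */
data adjacent;
  input id drug beg_date :mmddyy10. end_date :mmddyy10.;
  format beg_date end_date mmddyy10.;
  datalines;
1 100 1/1/2018 1/31/2018
1 100 2/1/2018 2/28/2018
1 100 3/10/2018 3/20/2018
;
run;

%macro merge_extents(tolerance=0, out=extents);
  data &out;
    set adjacent;               /* assumed to be sorted by id drug beg_date */
    by id drug beg_date;
    retain ext_beg ext_end;

    if first.drug then do;      /* a new group seeds a new Extent */
      ext_beg = beg_date;
      ext_end = end_date;
    end;
    else if beg_date <= ext_end + &tolerance then
      ext_end = max(ext_end, end_date);   /* close enough: grow the Extent */
    else do;
      output;                   /* too far: write the finished Extent */
      ext_beg = beg_date;       /* and reset it to the current Segment */
      ext_end = end_date;
    end;

    if last.drug then output;   /* flush the Extent still open at group end */

    format ext_beg ext_end mmddyy10.;
    keep id drug ext_beg ext_end;
  run;
%mend merge_extents;

%merge_extents(tolerance=0,  out=strict)
%merge_extents(tolerance=1,  out=adjacent_ok)
%merge_extents(tolerance=10, out=close_enough)
```

With this data, `tolerance=0` should keep three extents, `tolerance=1` should merge the January and February rows into one, and `tolerance=10` should fold all three rows into a single extent.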

Note: The situation is slightly more complicated (not shown) for these real-world cases:

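  • A missing start date means the segment has an unknown start date (assume it is the epoch, 0 = 01JAN1960, or some date that precedes all dates in the data or study)
  • A missing end date means the segment is still active today (the end date is the date the data is processed)

Example code: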

    data have;
      call streaminit(42);
      do id = 1 to 10;
        do _n_ = 1 to 50;
          drug = ceil(rand('UNIFORM', 10));
          beg_date = intnx ('MONTH', '01JAN2008'D, rand('UNIFORM',20));
          end_date = intnx ('DAY', beg_date, rand('UNIFORM',75));
          OUTPUT;
    
        end;
      end;
      format beg_date end_date yymmdd10.;
    run;
    
    proc sort data=have out=segments;
      by id drug beg_date end_date;
    run;
    
    data want;
      set segments;
      by id drug beg_date end_date;  * will error if incoming data is NOT sorted;
    
      retain ext_beg ext_end;
      retain gap_allowed 0; * set to 1 for contiguously adjacent segment ;
    
      if first.drug then do;
        ext_beg = beg_date;
        ext_end = end_date;
        segment_count = 0;
      end;
    
      if beg_date <= ext_end + gap_allowed then do;
        ext_end = max (ext_end, end_date);
        segment_count + 1;
      end;
      else do;
        extent_id + 1;
        OUTPUT;
        ext_beg = beg_date;
        ext_end = end_date;
        segment_count = 1;
      end;
    
      if last.drug then do;
        extent_id + 1;
        OUTPUT;
        * reset occurs implicitly;
        * it will happen at first. logic when control returns to top of step;
      end;
    
      format ext_: yymmdd10.;
      keep id drug ext_beg ext_end segment_count extent_id;
    run;
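
The example above assumes that every row has both dates. For the missing-date cases mentioned in the note, one possible preprocessing step is sketched below. The data set names `have_missing` and `segments_filled` and the sample rows are assumptions made for illustration; the choice of the epoch and of `today()` simply follows the two conventions described in the note.

```sas
/* Hypothetical rows: the first has no start date, the last has no end date */
data have_missing;
  input id drug beg_date :yymmdd10. end_date :yymmdd10.;
  format beg_date end_date yymmdd10.;
  datalines;
1 3 . 2008-02-15
1 3 2008-02-01 2008-03-10
1 3 2008-06-01 .
;
run;

data segments_filled;
  set have_missing;
  /* Assumption: an unknown start is treated as the SAS epoch (0 = 01JAN1960) */
  beg_date = coalesce(beg_date, 0);
  /* Assumption: an open-ended segment is still active on the processing date */
  end_date = coalesce(end_date, today());
run;

/* Sort after filling so that imputed start dates land in the right position */
proc sort data=segments_filled;
  by id drug beg_date end_date;
run;
```

The filled and sorted `segments_filled` could then be read in place of `segments` by the extent-building step.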
    
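I am a little confused. Where does End_date=3/31/2018 in the first row of your expected results come from? Why are you dropping the first input row, which shows drug 100 for all of 2018? Why would that not overlap all of the other records for drug 100?

I want to change the end date based on the other drug code, 90, which has a beg date of 4/1/2019. If there is any overlap between drug codes, then I want to break it into a single date range across all drugs, losing as little data as possible.

It looks like, when the drug changes, you want to truncate the end date of the previous drug to the day before the new drug starts. Is that right? So you will not have people taking more than one drug at a time, such as an asthma drug and a blood pressure drug?

What calendar are you using? April 31st? Is this April Fools' Day data? Please update the question with realistic sample data.

Thanks for the quick answer, but I still see overlap in the output for different drug codes. Also, drug code 100 has no overlap: one row runs from 2/1/2018 to 4/30/2018 and the next row runs from 5/1/2018 to 6/1/2018. Neither actually overlaps. The beg date does not fall before the end date.

Clearly I do not understand the problem, or what you are talking about.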