Filling missing values with SAS proc expand

I have the following problem:

I want to use proc expand to fill missing values by simply taking the value from the next data row.

My data looks like this:

date;index;
29.Jun09;-1693
30.Jun09;-1692
01.Jul09;-1691
02.Jul09;-1690
03.Jul09;-1689
04.Jul09;.
05.Jul09;.
06.Jul09;-1688
07.Jul09;-1687
08.Jul09;-1686
09.Jul09;-1685
10.Jul09;-1684
11.Jul09;.
12.Jul09;.
13.Jul09;-1683

As you can see, the index is missing for some dates. I would like to achieve the following:

date;index;
29.Jun09;-1693
30.Jun09;-1692
01.Jul09;-1691
02.Jul09;-1690
03.Jul09;-1689
04.Jul09;-1688
05.Jul09;-1688
06.Jul09;-1688
07.Jul09;-1687
08.Jul09;-1686
09.Jul09;-1685
10.Jul09;-1684
11.Jul09;-1683
12.Jul09;-1683
13.Jul09;-1683

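As you can see, the values for the missing data are taken from the next row (11.Jul09 and 12.Jul09 get their values from 13.Jul09).

proc expand seemed to be the right approach, so I started with the following code: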

PROC EXPAND DATA=DUMMY
OUT=WORK.DUMMY_TS
FROM = DAY
ALIGN = BEGINNING
METHOD = STEP
OBSERVED = (BEGINNING, BEGINNING);

ID date;
CONVERT index /;
RUN;
QUIT;

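This fills the gaps, but with the value from the previous row. Whatever I set for ALIGN or OBSERVED, and however I sort the data, I do not get the behavior I want.

If you know how to fix this, a hint would be great. Pointers to good papers on proc expand are appreciated as well.

Thanks for your help and kind regards,

Stephan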

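I don't know proc expand, but this can clearly be done in a few steps.

Read the dataset and create a new variable (pos in the code below) whose value is _n_, the row number.

Sort the dataset by this new variable in descending order: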

proc sort data=have;
    by descending pos;
run;

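Use lag or retain to fill each missing value from the "next" row (after the sort, the order is reversed, so the next row now comes first).

If needed, sort back into the original order: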

proc sort data=want;
    by pos;
run;
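
The steps above can be put together as the following minimal sketch. It assumes the input dataset is called dummy (as in the question) and has the variables date and index. The names pos and index_filled are illustrative and do not come from the original answer. Missing values at the very end of the series stay missing, because there is no later row to take a value from.

```sas
/* Step 1: remember the original row order in a helper variable. */
data have;
    set dummy;              /* assumed input: variables date and index */
    pos = _n_;              /* row number in the original order */
run;

/* Step 2: reverse the order so that each "next" row is read first. */
proc sort data=have;
    by descending pos;
run;

/* Step 3: carry the last non-missing value forward over the reversed rows. */
data want;
    set have;
    retain index_filled;    /* keeps its value from one row to the next */
    if not missing(index) then index_filled = index;
run;

/* Step 4: restore the original order. */
proc sort data=want;
    by pos;
run;
```

After the final sort, index_filled holds the original value where index was present and the next available value where it was missing.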

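I'm not a proc expand expert, but this is what I came up with. Create leads for the longest run of missing values (2 in this data), then coalesce them with index.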

data index;
   infile cards dsd dlm=';';
   input date:date11. index;
   format date date11.;
   cards4;
29.Jun09;-1693
30.Jun09;-1692
01.Jul09;-1691
02.Jul09;-1690
03.Jul09;-1689
04.Jul09;.
05.Jul09;.
06.Jul09;-1688
07.Jul09;-1687
08.Jul09;-1686
09.Jul09;-1685
10.Jul09;-1684
11.Jul09;.
12.Jul09;.
13.Jul09;-1683
;;;;
   run;
proc print;
   run;
PROC EXPAND DATA=index OUT=index2 method=none;
   ID date;
   convert index=lead1 / transform=(lead 1);
   CONVERT index=lead2 / transform=(lead 2);
   RUN;
   QUIT;
proc print; 
   run;
data index3;
   set index2;
   pocb = coalesce(index,lead1,lead2);
   run;
proc print;
   run;

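Modified to work for any reasonable gap size. The longest run of missing values is computed first, and one CONVERT statement with a LEAD transform is generated for each lead up to that length: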

data index;
   infile cards dsd dlm=';';
   input date:date11. index;
   format date date11.;
   cards4;
27.Jun09;
28.Jun09;
29.Jun09;-1693
30.Jun09;-1692
01.Jul09;-1691
02.Jul09;-1690
03.Jul09;-1689
04.Jul09;.
05.Jul09;.
06.Jul09;-1688
07.Jul09;-1687
08.Jul09;-1686
09.Jul09;-1685
10.Jul09;-1684
11.Jul09;.
12.Jul09;.
13.Jul09;-1683
14.Jul09;
15.Jul09;
16.Jul09;
17.Jul09;-1694
;;;;
   run;
proc print;
   run;
/* find the largest gap */
data gapsize(keep=n);
   set index;
   by index notsorted;
   if missing(index) then do;
      if first.index then n=0;
      n+1;
      if last.index then output;
      end;
   run;
proc summary data=gapsize;
   output out=maxgap(drop=_:) max(n)=maxgap;
   run;
/* Gen the convert statement for LEADs */
filename FT67F001 temp;
data _null_;
   file FT67F001;
   set maxgap;
   do i = 1 to maxgap;
      put 'Convert index=lead' i ' / transform=(lead ' i ');';
      end;
   stop;
   run;
proc expand data=index out=index2 method=none;
   id date;
   %inc ft67f001;
   run;
   quit;
data index3;
   set index2;
   pocb = coalesce(index,of lead:);
   drop lead:;
   run;
proc print;
   run;

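Thanks for the detailed solution. It works fine, and the dynamic lead approach is nice too (I had not seen it before). However, I accepted 兰姆's answer because his approach is more elegant. Thanks again for your help! Stephan

Just to explain for programmers at a more basic level, like me: this is also a simple last-observation-carried-forward approach. I just omitted the pos variable and the sorting steps.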