Base SAS: transposing counts by date variables

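I need to create a summary dataset/report that tracks the flow of these purchases. I have a dataset with the sign-up date for the overall service and 9 variables holding the purchase dates of different add-on products. If an add-on variable's date matches the sign-up date, that add-on product was included in the sign-up package. Any add-on purchase date after the sign-up date is a product purchased during the active account history. This is what it looks like: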

data have;
 /* Every variable is numeric; the date variables hold SAS date values */
 length ID 8
  signup_DT 8 preferredhd_tv_estbd_dt 8
  ultimate_estbd_dt 8 quant_estbd_dt 8
  FullyLoaded_estbd_dt 8 HB_estbd_dt 8 Cin_estbd_dt 8
  time_estbd_dt 8 router_estbd_dt 8 internet_estbd_dt 8;
 /* List input: ID is read as a plain number (not from column 8), */
 /* and every date variable gets its own : informat */
 input ID
  signup_DT : anydtdte9. preferredhd_tv_estbd_dt : anydtdte9.
  ultimate_estbd_dt : anydtdte9. quant_estbd_dt : anydtdte9.
  FullyLoaded_estbd_dt : anydtdte9. HB_estbd_dt : anydtdte9. Cin_estbd_dt : anydtdte9.
  time_estbd_dt : anydtdte9. router_estbd_dt : anydtdte9. internet_estbd_dt : anydtdte9.;
 format signup_DT preferredhd_tv_estbd_dt
  ultimate_estbd_dt quant_estbd_dt
  FullyLoaded_estbd_dt HB_estbd_dt Cin_estbd_dt
  time_estbd_dt router_estbd_dt internet_estbd_dt date9.;
datalines; 
98663699    4/7/14  4/9/14  4/7/14  9/12/14 10/15/14 7/7/14 4/7/14  4/7/14  4/12/14 .
33663798    4/11/14 .   4/11/14 .   4/11/14 4/11/14 4/11/14 4/11/14 6/11/14 7/15/14
43663463    5/12/14 5/12/14 5/12/14 9/5/14  9/17/14 .   .   .   .   .
77661437    5/16/14 .   5/16/14 .   10/31/14    .   5/16/14 5/16/14 11/16/14    .
85662295    5/29/14 .   .   5/29/14 .   6/12/14 .   .   11/16/14    .
36656756    6/4/14  .   .   .   6/4/14  6/4/14  6/12/14 6/4/14  6/4/14  12/4/14
67662646    6/14/14 .   6/14/14 8/31/14 .   .   6/17/14 6/14/14 .   6/22/14
55663786    6/26/14 .   .   .   8/14/14 6/26/14 7/8/14  6/26/14 11/30/14    .
44663191    8/21/14 .   9/30/14 .   .   .   .   1/12/15 .   10/31/14
;  
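The variables I am trying to produce are:

  • Sign-up month (easy to do)
  • Count of total sign-ups in that month (easy to do)
  • Total count of add-on products included with the sign-ups
  • A variable holding all the add-on product names (transposed from the original dataset)
  • Count of the different products purchased on the sign-up date
  • Count of add-on products purchased after the sign-up date but within the same month as the sign-up
  • A variable for each later month, counting the add-on products purchased in that month

Taking April alone as an example, the result I am hoping for looks like this: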

     data want ;
         length 
          Sign_up_Month $5
          Sign_up_count 8
          Initial_Products_total    8
          Products  $25
          Prod_Purchased_on_Signup  8
          AddPro_April_After_SU 8
          May 8 June 8  July 8  August 8    September 8 October 8; 
         INPUT Sign_up_Month    $   
          Sign_up_count 
          Initial_Products_total    
          Products  $
          Prod_Purchased_on_Signup  
          AddPro_April_After_SU    
          May   June    July    August  September   October; 
        datalines; 
        April   2   8   preferredhd_tv_estbd_dt     1                       
        April   2   8   ultimate_estbd_dt           2                           
        April   2   8   quant_estbd_dt                          1   
        April   2   8   FullyLoaded_estbd_dt    1                           1
        April   2   8   HB_estbd_dt            1                            
        April   2   8   Cin_estbd_dt    2                           
        April   2   8   time_estbd_dt   2                           
        April   2   8   router_estbd_dt     1       1               
        April   2   8   internet_estbd_dt                   1           
        ;
    
Here is the code for the first three variables of the output dataset: signup_month, Sign_up_count, and Initial_Products_total.

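    /* Sort so PROC TRANSPOSE can use BY-group processing */
    proc sort data=have;
        by ID signup_DT;
    run;

    /* One row per ID and product: _NAME_ is the product variable, COL1 its date. */
    /* Separate output names keep the original have dataset intact. */
    proc transpose data=have out=have_long;
        by ID signup_DT;
    run;

    /* Flag products bought on the sign-up date */
    data have_flag;
        set have_long;
        if signup_DT = COL1 then Initial_flag = 1;
    run;

    /* Sign-ups and initial products per sign-up month */
    proc sql;
        create table signup_summary as
        select count(distinct ID) as Sign_up_count,
               month(signup_DT) as signup_month,
               sum(Initial_flag) as Initial_Products_total
        from have_flag
        group by month(signup_DT);
    quit;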
    
I am having trouble creating the remaining variables: Prod_Purchased_on_Signup, AddPro_April_After_SU, and the counts by month.


I have been trying to use arrays to do this, but I keep running into trouble.

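From your question I cannot tell what level of aggregation you want the counts at. But if you are looking for a summary for each distinct ID and sign-up date, here is a solution. It requires the original input to be sorted by ID signup_DT.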

    proc transpose 
        data = have 
        out = trans; 
        by ID signup_DT; 
    run; 
    
    /* Sort for by group processing and regular name order */
    proc sort data = trans;
        by ID signup_DT _NAME_;
    run;
    
    data products (drop = _NAME_ COL1 i);
        set trans;
        /* For by group processing */
        by ID signup_DT;
        /* Get the signup month as a word */
        signup_month = put(signup_DT, monname.);
        /* Give the product list variable a long length to prevent truncation */
        length Products $400;
        /* Retain so we can add to the variables as we go down through the group */
        retain Products Sign_up_count signups_month0-signups_month4;
        /* Set up array reference for later month counts so we can loop */
        array som[5] signups_month0-signups_month4;
        /* Reset our new variables */
        if first.signup_DT then do;
            Products = "";
            Sign_up_count = 0;
            do i = 1 to 5;
                som[i] = 0;
            end;
        end;
        /* Add to the list and count of sign-up products */
        if signup_DT = COL1 then do;
            Sign_up_count + 1;
            Products = catx(" ", Products, _NAME_);
        end;
        /* Otherwise add to the later month counts by checking the months separating the dates */
        else do i = 1 to 5;
            if intck("month", signup_DT, COL1) = i - 1 then som[i] + 1;
        end;
        /* Only output once we have completed a group */
        if last.signup_DT and Sign_up_count  then output;
    run;
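
One detail of the INTCK call above is worth checking: with its default method, the month interval counts the calendar-month boundaries crossed between two dates rather than elapsed 30-day periods. That is why signups_month0 holds purchases made later in the sign-up month itself. A small sketch, where the second pair of dates and the variable names are made up for illustration:

```sas
data _null_;
    /* Same calendar month: no month boundary is crossed */
    same_month = intck("month", "07APR2014"d, "12APR2014"d);
    /* Only one day apart, but the April/May boundary is crossed */
    next_month = intck("month", "30APR2014"d, "01MAY2014"d);
    /* Writes to the log: same_month=0 next_month=1 */
    put same_month= next_month=;
run;
```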
    

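This is really great, thank you!!! I would like to aggregate by year, month, and then by product. Is it possible to have the actual purchase month as the variables rather than signups_month0 - signups_month4? And could the Products variable be left unconcatenated, so that I can tell which products were purchased after sign-up?

I am not sure what the output you describe would look like. But you can use an output statement instead of catx to get a row for each product purchased at sign-up.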
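
A minimal sketch of that suggestion, not a tested solution. It assumes the transposed dataset trans from the answer above (columns ID, signup_DT, _NAME_, COL1) and, unlike the answer, keeps later purchases as rows too. The names product_rows, product_counts, Product, purchase_DT, bought_at_signup, n_at_signup, and n_purchases are made up for illustration.

```sas
/* One row per ID and purchased product, instead of a concatenated list */
data product_rows (keep = ID signup_DT Product purchase_DT bought_at_signup);
    set trans;
    /* Products that were never purchased have a missing date */
    if missing(COL1) then delete;
    length Product $32;
    Product = _NAME_;
    purchase_DT = COL1;
    format purchase_DT date9.;
    /* 1 when bought on the sign-up date, 0 otherwise */
    bought_at_signup = (COL1 = signup_DT);
run;

/* Count purchases by sign-up year and month, product, and actual purchase month */
proc sql;
    create table product_counts as
    select year(signup_DT)       as signup_year,
           month(signup_DT)      as signup_month,
           Product,
           year(purchase_DT)     as purchase_year,
           month(purchase_DT)    as purchase_month,
           sum(bought_at_signup) as n_at_signup,
           count(*)              as n_purchases
    from product_rows
    group by calculated signup_year, calculated signup_month, Product,
             calculated purchase_year, calculated purchase_month;
quit;
```

The long layout keeps the actual purchase month as data, so each product stays identifiable; it could be pivoted into one column per month afterwards if a wide report is needed.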