SAS PROC GENMOD with clustered, multiply imputed data

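I am trying to obtain risk ratio estimates from multiply imputed, cluster-correlated data in SAS, using log-binomial regression in PROC GENMOD. I have been able to compute risk ratio estimates for the original (non-MI) data, but the procedure seems to hit a snag when generating an output data set for me to read into PROC MIANALYZE.

I include a REPEATED SUBJECT statement so that SAS will use robust variance estimation. Without the REPEATED SUBJECT statement, the ODS OUTPUT statement seems to work fine; however, once I include it, I receive a warning message saying that my output data set was not generated.

If the GENMOD/MIANALYZE combination is not appropriate, I am open to other methods and suggestions for producing risk ratio estimates with these data, but I would like to see whether I can make this work! I would prefer SAS if possible, because of license access problems with other programs (such as Stata and SUDAAN). My code is below, where seroP is my binary outcome, int is the binary independent variable of interest (received the intervention vs. did not), tf5 is a binary covariate, age is a continuous covariate, and village identifies the clusters: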

Proc GenMod data=sc.wide_mip descending ; by _Imputation_;
Class int (ref='0') tf5 (ref='0') village /param=ref ;
weight weight;
Model seroP= int tf5 age  / 
dist=bin Link=logit ;
repeated subject=village/ type=unstr;
estimate 'Beta' int 1 -1/exp;
ods output ParameterEstimates=sc.seroP;
Run;

proc mianalyze parms =sc.seroP;
class int  tf5  ;
modeleffects int tf5 age village  ;
run;

Thanks for your help.

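The short answer is to add the PRINTMLE option at the end of the REPEATED statement. However, the code you posted here may not produce what you really want, so here is a longer answer:

1. The following programs are based on SAS 9.3 (or later) for Windows. If you are using an older version, the coding may differ.

2. PROC MIANALYZE needs three ODS tables from PROC GENMOD, not one: 1) the parameter estimates table (~_est); 2) the covariance table (~_covb); and 3) the parameter index table (parminfo). The first line of the PROC MIANALYZE step should look like this:

PROC MIANALYZE parms = ~_est covb = ~_covb parminfo=parminfo;
where ~_est refers to the ODS parameter estimates table and ~_covb refers to the ODS covariance table.

There are different types of ODS parameter estimates and covariance tables. The symbol "~" should be replaced with the prefix for a specific set of ODS tables, as discussed in the next part.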
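
For context, PROC MIANALYZE pools the per-imputation results with Rubin's rules, which is why every imputation has to contribute both an estimate and a variance (or covariance matrix). For a single parameter, writing $\hat{Q}_i$ and $U_i$ for the estimate and its variance in imputation $i$ of $m$ (symbols introduced here purely for illustration), the combination is:

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m}U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2, \qquad
T = \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B
```

Here $\bar{Q}$ is the pooled estimate, $\bar{U}$ the within-imputation variance, $B$ the between-imputation variance, and $\sqrt{T}$ the pooled standard error. Which variance $U_i$ is supplied (model-based or robust) depends on which set of ODS tables is passed in.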

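3. PROC GENMOD can generate three different sets of ODS parameter estimates and covariance tables.

3a) The first set of tables comes from a non-repeated model (i.e., one without a REPEATED statement). In your case, it looks like: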

Proc GenMod data=sc.wide_mip descending ; by _Imputation_;
…
MODEL seroP= int tf5 age/dist=bin Link=logit COVB; /*adding the option COVB*/
/*repeated subject=village/ type=unstr;*/ 
/*Note that the above line has been changed to comments*/
…
ODS OUTPUT  
    /*the estimates from a non-repeated model*/
    ParameterEstimates=norepeat_est
    /*the covariance from a non-repeated model*/ 
    Covb = norepeat_covb 
    /*the indices of the parameters*/
    ParmInfo=parminfo;
Run;

Note that 1) the COVB option is added to the MODEL statement to obtain the ODS covariance table; 2) the REPEATED statement is commented out; and 3) the ~_est table is named norepeat_est and, similarly, the ~_covb table is named norepeat_covb.

3b) The second set of tables contains the model-based estimates from a repeated model. In your case, it looks like:

Proc GenMod data=sc.wide_mip descending ; by _Imputation_;
…
MODEL seroP= int tf5 age/dist=bin Link=logit;
REPEATED subject=village/ type=un MODELSE MCOVB;/*options added*/
…
ODS OUTPUT 
    /*the model-based estimates from a repeated model*/
    GEEModPEst=mod_est
    /*the model-based covariance from a repeated model*/ 
    GEENCov= mod_covb 
    /*the indices of the parameters*/
    parminfo=parminfo;
Run;

In the REPEATED statement, the MODELSE option produces the model-based parameter estimates and MCOVB produces the model-based covariance. Without these options, the corresponding ODS tables (GEEModPEst and GEENCov) are not generated. Note that the ODS table names differ from the previous case: here the tables are GEEModPEst and GEENCov, whereas for the non-repeated model they were ParameterEstimates and CovB. Here the ~_est table is named mod_est, for model-based estimates; similarly, the ~_covb table is named mod_covb. The ParmInfo table is the same as in the previous model.

3c) The third set contains the empirical estimates, also from a repeated model. Empirical estimates are also known as robust estimates. It sounds like these are the results you want. It looks like:

Proc GenMod data=sc.wide_mip descending ; by _Imputation_;
…
MODEL seroP= int tf5 age/dist=bin Link=logit;
REPEATED subject=village/ type=un ECOVB;/*option changed*/
…
ODS OUTPUT 
    /*the empirical(ROBUST) estimates from a repeated model*/
    GEEEmpPEst=emp_est
    /*the empirical(ROBUST) covariance from a repeated model*/ 
    GEERCov= emp_covb 
    /*the indices of the parameters*/
    parminfo=parminfo;
Run;

As you may have noticed, the option in the REPEATED statement has been changed to ECOVB, so that the empirical covariance table is generated. Nothing extra is needed to produce the empirical parameter estimates, because the procedure always generates them. The ParmInfo table is the same as in the previous cases.

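4. Putting it all together, you can actually generate all three sets of tables at once. The only catch is that the PRINTMLE option must be added so that the estimates from the non-repeated model are produced when a REPEATED statement is present. The combined program looks like this: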

Proc GenMod data=sc.wide_mip descending ; by _Imputation_;
Class int (ref='0') tf5 (ref='0') village /param=ref ;
weight weight;
Model seroP= int tf5 age  / 
dist=bin Link=logit COVB; /*COVB to have non-repeated model covariance*/
repeated subject=village/ type=UN MODELSE PRINTMLE MCOVB ECOVB;/*all options*/
estimate 'Beta' int 1 -1/exp;

ODS OUTPUT  
    /*the estimates from a non-repeated model*/
    ParameterEstimates=norepeat_est
    /*the covariance from a non-repeated model*/ 
    Covb = norepeat_covb 
    /*the indices of the parameters*/
    ParmInfo=parminfo

    /*the model-based estimates from a repeated model*/
    GEEModPEst=mod_est
    /*the model-based covariance from a repeated model*/ 
    GEENCov= mod_covb 

    /*the empirical(ROBUST) estimates from a repeated model*/
    GEEEmpPEst=emp_est
    /*the empirical(ROBUST) covariance from a repeated model*/ 
    GEERCov= emp_covb
    ;
 Run;

/*Analyzing non-repeated results*/
PROC MIANALYZE parms = norepeat_est covb = norepeat_covb parminfo=parminfo;
class int  tf5  ;
modeleffects int tf5 age  ; /*village only defines the clusters; it is not a model effect*/
run;

/*Analyzing model-based results*/
PROC MIANALYZE parms = mod_est covb = mod_covb parminfo=parminfo;
class int  tf5  ;
modeleffects int tf5 age  ; /*village only defines the clusters; it is not a model effect*/
run;

/*Analyzing empirical(ROBUST) results*/
PROC MIANALYZE parms = emp_est covb = emp_covb parminfo=parminfo;
class int  tf5  ;
modeleffects int tf5 age  ; /*village only defines the clusters; it is not a model effect*/
run;
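
One more point, since the question asks for risk ratios: with Link=logit the exponentiated coefficients are odds ratios, whereas a log-binomial model uses a log link. Below is a minimal, untested sketch that keeps only the empirical (robust) tables and pools them. It rests on several assumptions not stated in the question: int and tf5 are taken to be numeric 0/1 variables, so they enter as plain covariates and no CLASS statement is needed in PROC MIANALYZE; the data set names pooled and pooled_rr are made up for illustration; and the variable names LCLMean and UCLMean in the pooled table should be confirmed with PROC CONTENTS. Log-binomial models are also known to have convergence problems, so this may not fit on every imputed data set.

```sas
/* Sketch only: log-binomial GEE per imputation, robust results pooled.   */
/* Assumes sc.wide_mip and its variables exist as described in the        */
/* question, and that int and tf5 are numeric 0/1 (an assumption).        */
proc genmod data=sc.wide_mip descending;
    by _Imputation_;
    class village;                       /* subject variable must be a CLASS variable */
    weight weight;
    /* LINK=LOG makes exp(beta) a risk ratio; LINK=LOGIT would give an odds ratio */
    model seroP = int tf5 age / dist=bin link=log;
    repeated subject=village / type=un ecovb;
    ods output GEEEmpPEst=emp_est        /* empirical (robust) estimates  */
               GEERCov=emp_covb          /* empirical (robust) covariance */
               ParmInfo=parminfo;        /* parameter index table         */
run;

/* Pool across imputations; "pooled" is a hypothetical data set name */
proc mianalyze parms=emp_est covb=emp_covb parminfo=parminfo;
    modeleffects Intercept int tf5 age;
    ods output ParameterEstimates=pooled;
run;

/* Back-transform pooled log risk ratios to the risk ratio scale.     */
/* LCLMean/UCLMean are the confidence limit names assumed here.       */
data pooled_rr;
    set pooled;
    RR       = exp(Estimate);
    RR_lower = exp(LCLMean);
    RR_upper = exp(UCLMean);
run;
```

Exponentiating after pooling keeps the combination on the log scale, where the estimates are closer to normally distributed. If int and tf5 have to stay in a CLASS statement, the PARMS= data set may need the CLASSVAR=LEVEL option in PROC MIANALYZE; that is worth checking against the documentation.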
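Hope that helps. Further reading:

  • Allison, Paul D. Logistic Regression Using SAS®: Theory and Application, Second Edition (pp. 226-234). Copyright © 2012, SAS Institute Inc., Cary, North Carolina, USA.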
