SAS: Generating dummy variables from categorical variables


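How can I create dummy variables (coded 0 or 1) in SAS for every value of each categorical variable in a dataset? I have many variables, so I would like to do this with something like a loop.

In Stata, I would use the following code: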

foreach var of varlist var1 var2 var3 var4 var5 var6 var7 {
    tabulate `var', gen(`var')
    drop `var'
}

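There are several data transformation techniques you can try.

One approach is to use PROC TABULATE to collect all the distinct variable values into a dataset, which is then used to generate the dummy variable names.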

%macro DummyVariables(data=, var=, out=, genmode=);

  proc tabulate data=&data out=class_values; 
    class &var;
    table &var;
  run;

  %local dsid index seq &var _type_ varname varvalue p_type_;

  %let p_type_ = 0;

  %let dsid = %sysfunc(open(class_values));

  %syscall SET ( dsid );

  data &out;
    set &data;

      length

      %if &genmode=1 %then %do;
        %do index = 1 %to %sysfunc(ATTRN(&dsid,NOBS));
          dummy_&index
        %end;
      %end;

      %if &genmode=2 %then %do;
        %do %while (0 = %sysfunc(fetch(&dsid)));
          %let index = %sysfunc(index(&_type_, 1));
          %let varname = %scan(&var,&index);
          %if &_type_ ne &p_type_ %then %let seq=1; %else %let seq=%eval(&seq+1);
          &varname._&seq
          %let p_type_ = &_type_;
        %end;
      %end;

      %if &genmode=3 %then %do;
        %do %while (0 = %sysfunc(fetch(&dsid)));
          %let index = %sysfunc(index(&_type_, 1));
          %let varname = %scan(&var,&index);
          %let varvalue = &&&varname;
          "&varname._&varvalue"n
        %end;
      %end;
      4;  /* declares every dummy as numeric with length 4; no values are assigned */
  run;

  %let dsid = %sysfunc(close(&dsid));

%mend;


%DummyVariables (data=sashelp.class, var=name age sex height weight, out=want1, genmode=1);
%DummyVariables (data=sashelp.class, var=name age sex height weight, out=want2, genmode=2);

options validvarname = any;
%DummyVariables (data=sashelp.class, var=name age sex height weight, out=want3, genmode=3);
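To see what the macro iterates over, it can help to run the PROC TABULATE step on its own and inspect the output dataset. The following is a minimal sketch using only two of the variables; the dataset name class_values matches the macro, while the choice of age and sex is just for illustration. In the output, _TYPE_ is a character string of 0s and 1s with one position per CLASS variable, which is what the macro searches with index(&_type_, 1) to work out which variable a row belongs to.

```sas
/* Sketch: inspect the dataset that PROC TABULATE writes with OUT=. */
proc tabulate data=sashelp.class out=class_values;
  class age sex;   /* one CLASS variable per position in _TYPE_ */
  table age sex;   /* separate one-way tables, so one row per distinct value */
run;

/* Each row carries the class value, _TYPE_, and the count N. */
proc print data=class_values;
run;
```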
Mode 1: one dummy variable with a numeric suffix for each distinct combination of variable and value.

NOTE: Variable dummy_1 is uninitialized.
NOTE: Variable dummy_2 is uninitialized.
...
NOTE: Variable dummy_58 is uninitialized.
NOTE: Variable dummy_59 is uninitialized.
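Mode 2: one dummy variable with a numeric suffix for each distinct combination of variable and value, with the dummy name based on the original variable name.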

NOTE: Variable name_1 is uninitialized.
...
NOTE: Variable name_19 is uninitialized.
NOTE: Variable age_1 is uninitialized.
...

NOTE: Variable age_6 is uninitialized.
NOTE: Variable sex_1 is uninitialized.
NOTE: Variable sex_2 is uninitialized.
NOTE: Variable height_1 is uninitialized.
...
NOTE: Variable height_17 is uninitialized.
NOTE: Variable weight_1 is uninitialized.
etc ...
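Mode 3: one dummy variable with a value suffix for each distinct combination of variable and value, with the dummy name based on both the variable name and the value.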


NOTE: Variable name_Alfred is uninitialized.
...
NOTE: Variable name_William is uninitialized.
NOTE: Variable age_11 is uninitialized.
...
NOTE: Variable age_15 is uninitialized.
NOTE: Variable age_16 is uninitialized.
NOTE: Variable sex_F is uninitialized.
NOTE: Variable sex_M is uninitialized.
NOTE: Variable 'height_51.3'n is uninitialized.
NOTE: Variable 'height_56.3'n is uninitialized.
...
NOTE: Variable height_69 is uninitialized.
NOTE: Variable height_72 is uninitialized.
NOTE: Variable 'weight_50.5'n is uninitialized.
NOTE: Variable weight_77 is uninitialized.
...
NOTE: Variable weight_133 is uninitialized.
NOTE: Variable weight_150 is uninitialized.
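The "uninitialized" notes appear because the macro only declares the dummy variables in a LENGTH statement and never assigns them. As a minimal sketch of the missing step, assuming the mode 3 naming and that the values of the variable are known in advance, the dummies for one variable can be filled with a comparison, which SAS evaluates to 1 when true and 0 when false. The output dataset name want_filled is a placeholder.

```sas
/* Sketch: populate 0/1 dummies for a single variable of sashelp.class. */
/* sex_F and sex_M follow the mode 3 naming shown in the log above.     */
data want_filled;
  set sashelp.class;
  sex_F = (sex = 'F');   /* 1 when sex is F, otherwise 0 */
  sex_M = (sex = 'M');   /* 1 when sex is M, otherwise 0 */
run;
```

Generalizing this to every variable and value would mean generating one such assignment per row of class_values, in the same loops the macro already uses to build the names.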


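Please provide sample data suitable for SAS users.

Does this help you?

Do the dummy variables also need to be populated with 0 and 1?

Where do you want the dummy variables: in a matrix or in a dataset?

Why do you need dummy variables? For most SAS procedures you can simply tell the procedure to treat an existing variable as a CLASS variable, and it will automatically create the dummy columns in its internal matrix operations.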
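As an illustration of the CLASS approach mentioned in the last comment, here is a hedged sketch with sashelp.class. The model itself (weight on sex and height) is an arbitrary example, not something from the question. The second step uses PROC GLMMOD, which can write the design matrix it builds to a dataset when the dummy columns are needed outside the procedure. The output dataset names design and parm are placeholders.

```sas
/* Sketch: let the procedure build the dummy columns internally. */
proc glm data=sashelp.class;
  class sex;                             /* sex is expanded into dummy columns */
  model weight = sex height / solution;  /* arbitrary example model */
run;
quit;

/* Sketch: write the design matrix, including dummy columns, to a dataset. */
proc glmmod data=sashelp.class outdesign=design outparm=parm noprint;
  class sex age;
  model weight = sex age;   /* GLMMOD requires a MODEL statement */
run;
```

The OUTPARM= dataset describes which column of the design dataset corresponds to which effect and level.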