Warning: file_get_contents(/data/phpspider/zhask/data//catemap/8/sorting/2.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Sorting 在Stata代码中排序变量列表_Sorting_Stata_Data Management - Fatal编程技术网

Sorting 在Stata代码中排序变量列表

Sorting 在Stata代码中排序变量列表,sorting,stata,data-management,Sorting,Stata,Data Management,我将以下内容视为一种编程练习,而不是一种基于统计的做事方式 基本上,我想用一个预测变量运行N逻辑回归,然后为每个变量存储变量名及其卡方值。在所有预测完成后,我想显示每个预测变量,按卡方从高到低排序 到目前为止,我有以下几点: local depvar binvar1 local indepvars predvar1 predvar2 predvar3 * expand and check collinearity * _rmdcoll `depvar' `indepvars', expa

我将以下内容视为一种编程练习,而不是一种基于统计的做事方式

基本上,我想用一个预测变量运行
N
逻辑回归,然后为每个变量存储变量名及其
卡方
值。在所有预测完成后,我想显示每个预测变量,按卡方从高到低排序

到目前为止,我有以下几点:

local depvar    binvar1
local indepvars predvar1 predvar2 predvar3

* expand and check collinearity *
_rmdcoll `depvar' `indepvars', expand
local indepvars "`r(varlist)'"

* first order individual variables by best chi-squared *
local vars
local chis
foreach v in `indepvars' {
    di "RUN: logistic `depvar' `v'"
    quietly logistic `depvar' `v'

    * check if variable is not omitted (constant and iv) *
    if `e(rank)' < 2 {
        di "OMITTED (rank < 2): `v'"
        continue
    }

    * check if chi-squared is > 0 *
    if `e(chi2)' <= 0 {
        di "OMITTED (chi2 <= 0): `v'"
        continue
    }

    * store *
    local vars "`vars' `v'"
    local chis "`chis' `e(chi2)'"
    di "ADDED: `v' (chi2: `e(chi2)')"
}

* ... now sort each variable (from varlist vars) by chi2 (from varlist chis) ... * 
local ordered predvar2 3 predvar1 2 predvar3 1
然后我想得到如下结果:

local depvar    binvar1
local indepvars predvar1 predvar2 predvar3

* expand and check collinearity *
_rmdcoll `depvar' `indepvars', expand
local indepvars "`r(varlist)'"

* first order individual variables by best chi-squared *
local vars
local chis
foreach v in `indepvars' {
    di "RUN: logistic `depvar' `v'"
    quietly logistic `depvar' `v'

    * check if variable is not omitted (constant and iv) *
    if `e(rank)' < 2 {
        di "OMITTED (rank < 2): `v'"
        continue
    }

    * check if chi-squared is > 0 *
    if `e(chi2)' <= 0 {
        di "OMITTED (chi2 <= 0): `v'"
        continue
    }

    * store *
    local vars "`vars' `v'"
    local chis "`chis' `e(chi2)'"
    di "ADDED: `v' (chi2: `e(chi2)')"
}

* ... now sort each variable (from varlist vars) by chi2 (from varlist chis) ... * 
local ordered predvar2 3 predvar1 2 predvar3 1
或者,或者

local varso predvar2 predvar1 predvar3
local chiso 3 2 1

这里有一种方法

local depvar    binvar1
local indepvars predvar1 predvar2 predvar3

* expand and check collinearity *
_rmdcoll `depvar' `indepvars', expand
local indepvars "`r(varlist)'"

* first order individual variables by best chi-squared *

gen chisq = . 
gen vars = "" 
local i = 1 

foreach v in `indepvars' {
     di "RUN: logistic `depvar' `v'"
     quietly logistic `depvar' `v'

     * check if variable is not omitted (constant and iv) *
     if `e(rank)' < 2 {
          di "OMITTED (rank < 2): `v'"
     }

     * check if chi-squared is > 0 *
     else if `e(chi2)' <= 0 {
          di "OMITTED (chi2 <= 0): `v'"
     }

     * store *
     else {  
          quietly replace vars  = "`v'" in `i' 
          quietly replace chisq = -e(chi2) in `i' 
          local ++i   
          di "ADDED: `v' (chi2: `e(chi2)')"
     }
}

sort chisq
replace chisq = -chisq 
l vars chisq if chisq < ., noobs 
local-depvar-binvar1
本地indepvars predvar1 predvar2 predvar3
*展开并检查共线性*
_rmdcoll`depvar'`indepvars',展开
局部索引“`r(varlist)”
*最佳卡方检验的一阶个体变量*
gen chisq=。
gen vars=“”
局部i=1
“indepvars”中的foreach v{
di“RUN:logistic`depvar'`v'”
“depvar”“v”
*检查变量是否未省略(常量和iv)*
如果'e(秩)'小于2{
di“省略(秩<2):'v'”
}
*检查卡方是否大于0*

如果'e(chi2)'非常聪明,那么您认为是否值得在这里使用
tempfile
,而不是在主数据集中工作(例如,如果它非常大或其他什么)?在我的方法中有一个假设,即使用的变量数量不大于观察值的数量。如果这是错误的,则使用外部文件会有一定的意义,但很可能建模不会起作用。否则,我看不到使用另一个文件有任何意义。没有变量会有一些尴尬使用
gsort
可以避免否定卡方检验结果并在
sort
之后否定卡方检验结果的尴尬(
sort
将最低值放在第一位)您可以将
logit
选项添加到
\u rmcoll
命令中,以实现除多重线性外的完美分离。