Stata: matching variable names with their corresponding values across different databases

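I am trying to use a macro to import variable names from an external dataset, match those names with the corresponding values in my master file, and then use esttab to export the results of a looped principal component analysis.

My code looks like this: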

preserve

forvalue file = 537(3)647 {

    import excel "C:\Users\M\Dropbox\Masterarbeit\Stata12\test/`file'.xls", sheet("Sheet1") firstrow clear

    local x ""
    foreach var of varlist *SA {
        local x `x' `var'
    }

    clear
    restore

    forvalue z = 537(3)647 {
        pca `x' if rMonth < `z'+3, comp(1)
        esttab e(L) using pc`z'.csv, replace
    }
}

Maybe this example helps:

clear all
set more off

/*
load two example MS Excel files with var names only and accumulate var names in a local.
files are named varfile.xls and varfile2.xls
*/

foreach i in "" "2" {

    import excel "/home/roberto/Desktop/stata_tests/varfile`i'.xls", firstrow clear

    * get var names
    quietly ds

    * save var names in local
    local myvars `myvars' `r(varlist)'
}

* load database that contains vars and values
sysuse auto, clear

* do pca
pca `myvars'

/*
varfile.xls contains variables "weight" and "price"
varfile2.xls contains variables "mpg" and "length"
*/

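ds works here because it picks up the variable names found in the MS Excel sheet and stores them in r(varlist). See help ds and help saved results (or help stored results). After that, we load a "complete" database and run pca on the stored variable names.

The MS Excel files hold only a header row of variable names (see the comment at the end of the example).

The longer code below uses the same file paths as the question and appears to be the asker's full script: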

set excelxlsxlargefile on

cd C:\Users\M\Dropbox\Masterarbeit\Stata12\sentiment_6m

import excel "C:\Users\M\Dropbox\Masterarbeit\Daten\Dataimport\sentiments\Google Query CDX.xlsx", sheet("Tabelle1") firstrow

set more off

gen Month = month( Date)

gen     January     =   1   if  Month   ==  1
gen     February    =   1   if  Month   ==  2
gen     March   =   1   if  Month   ==  3
gen     April   =   1   if  Month   ==  4
gen     May =   1   if  Month   ==  5
gen     June    =   1   if  Month   ==  6
gen     July    =   1   if  Month   ==  7
gen     August  =   1   if  Month   ==  8
gen     September   =   1   if  Month   ==  9
gen     October =   1   if  Month   ==  10
gen     November    =   1   if  Month   ==  11
gen     December    =   1   if  Month   ==  12
replace     January     =   0   if  January     ==  .
replace     February    =   0   if  February    ==  .
replace     March   =   0   if  March   ==  .
replace     April   =   0   if  April   ==  .
replace     May =   0   if  May ==  .
replace     June    =   0   if  June    ==  .
replace     July    =   0   if  July    ==  .
replace     August  =   0   if  August  ==  .
replace     September   =   0   if  September   ==  .
replace     October =   0   if  October ==  .
replace     November    =   0   if  November    ==  .
replace     December    =   0   if  December    ==  .


foreach var of varlist *_qry {
    sum `var', meanonly
    local mu = r(mean)
    reg `var' January February March April May June July August September October November December, nocons
    predict double `var'SA, residual
    replace `var'SA = `var'SA + `mu'
    egen sd = sd(`var'SA)
    replace `var'SA = `var'SA / sd
    drop sd
    drop `var'
}



* BIG LOOP *

generate double rMonth = mofd( Date)
global tflist ""

forvalue y = 537(3)647 {

    foreach var of varlist *SA {
        reg MidCDX `var' if rMonth <= `y'
        tempfile tfcur
        parmest, idstr("`var'") saving(`"`tfcur'"', replace) flis(tflist)
    }

    * Concatenate files into memory (REPLACING THE OLD DATA) *
    preserve
    clear
    append using $tflist
    sencode idstr, gene(xvar)
    lab var xvar "X-variable"
    keybygen xvar, gene(parmseq)
    drop if parm == "_cons"
    egen rank = rank(-t)
    gsort -t
    drop if rank > 40
    save `y', replace
    export excel xvar t using `y', firstrow(variables) replace
    foreach TF in $tflist {
        erase `"`TF'"'
    }
    global tflist ""
    restore

}

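I think this answers the specific question you asked.

Edit: Looking more closely at your code, I am not sure the problem lies in matching variable names against the complete database; it seems rather to lie in the way preserve and restore are set up. Instead of using this pair of commands, try loading the complete database whenever you need it (with use).

What do you have in memory before preserve? Where does your error appear? Please post more code. A reproducible example would help.

Edit 2: My guess now is that you have nothing in memory before preserve, so when you restore you are just wiping the slate clean; you are restoring an empty database. Trying pca then gives: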

no variables defined
r(111);

preserve keeps the data as they were just before the command was issued.
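A minimal sketch of the alternative suggested in the first edit: collect the variable names, then load the full dataset explicitly with use rather than relying on preserve and restore. It runs on the auto data shipped with Stata. The tempfile master and the keep step that mimics the Excel import are assumptions made for illustration, not the asker's files.

```stata
clear all
set more off

* Stand-in for the asker's master file (assumption: saved to a tempfile here)
sysuse auto, clear
tempfile master
save `master'

* Stand-in for "import excel ..., firstrow clear": a dataset that is used
* only as a source of variable names
keep price weight
unab x : *

* Load the complete data explicitly instead of restoring a snapshot
use `master', clear
pca `x', components(1)
```

Because the master data are reloaded with use, it no longer matters what was (or was not) in memory when the loop started.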

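@Roberto Ferrer is addressing your main question, which hinges on comparing variable names across files. I am adding some details on the use of local macros and wildcard syntax.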

local x ""
foreach var of varlist *SA {
    local x `x' `var'
}

The loop above is a long-winded way of doing what unab does in one line:

unab x : *SA
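To check that the two forms build the same list, here is a small sketch on the auto data shipped with Stata. The wildcard t* and the macro name y are chosen only for illustration.

```stata
sysuse auto, clear

* Long way: accumulate the names one at a time
local x ""
foreach var of varlist t* {
    local x `x' `var'
}

* Short way: unab expands the wildcard in one step
unab y : t*

display "`x'"
display "`y'"
```

Both macros end up holding trunk turn, the two variables in auto whose names start with t.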

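A personal comment: there is too much code here for me to want to try to absorb what you are trying to do. I comment only on a few details of technique.

This code

gen January = 1 if Month == 1
gen February = 1 if Month == 2
gen March = 1 if Month == 3
gen April = 1 if Month == 4
gen May = 1 if Month == 5
gen June = 1 if Month == 6
gen July = 1 if Month == 7
gen August = 1 if Month == 8
gen September = 1 if Month == 9
gen October = 1 if Month == 10
gen November = 1 if Month == 11
gen December = 1 if Month == 12
replace January = 0 if January == .
replace February = 0 if February == .
replace March = 0 if March == .
replace April = 0 if April == .
replace May = 0 if May == .
replace June = 0 if June == .
replace July = 0 if July == .
replace August = 0 if August == .
replace September = 0 if September == .
replace October = 0 if October == .
replace November = 0 if November == .
replace December = 0 if December == .

can be rewritten as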

tokenize "`c(Months)'"
forval j = 1/12 { 
    gen ``j'' = Month == `j' 
}

The month names January to December are wired into c(Months).
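How the nested macro quotes resolve may not be obvious, so here is a small sketch. tokenize splits the string into the numbered macros 1, 2, ..., 12, and the doubly quoted j is evaluated inside out: the inner j first becomes a number, then that numbered macro becomes a month name. The 12-observation toy dataset is an assumption made for illustration.

```stata
clear
set obs 12
gen Month = _n

* Split "January February ... December" into numbered macros 1 to 12
tokenize "`c(Months)'"
display "`1' ... `12'"

forvalues j = 1/12 {
    * inner j resolves to a number, then that numbered macro to a month name
    gen byte ``j'' = Month == `j'
}

list Month January March December in 1/3
```

Since Month == `j' evaluates to 1 or 0 directly, the separate replace step that turned missing values into zeros is no longer needed.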

Likewise, the body of your seasonal-adjustment loop (the foreach over *_qry) can be shortened to

reg `var' January-December, nocons
predict double `var'SA, residual
sum `var' 
replace `var'SA = (`var'SA + r(mean)) / r(sd)

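Note that creating an entire variable just to hold the SD is not a good idea. It negates the time saved by using summarize, meanonly.

I won't comment here on what you are trying to do statistically by adding the mean and then dividing by the SD.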
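One detail behind the switch to plain summarize: the meanonly option skips the variance calculation, so r(sd) is not returned, whereas plain summarize returns both r(mean) and r(sd) and no helper variable is needed. A small check on the auto data shipped with Stata (chosen only for illustration):

```stata
sysuse auto, clear

* meanonly returns r(mean), r(min), r(max) and so on, but not r(sd)
summarize price, meanonly
display r(mean)
* r(sd) was not computed, so this displays a missing value
display r(sd)

* plain summarize also returns r(sd)
quietly summarize price
display r(mean)
display r(sd)
```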