Stata: matching variable names with their corresponding values across different databases
I am trying to use macros to import variable names from an external dataset, match those names with the corresponding values in the master file, and then export the results of a looped principal component analysis using esttab.
My code looks like this:
preserve
forvalue file = 537(3)647 {
import excel "C:\Users\M\Dropbox\Masterarbeit\Stata12\test/`file'.xls", sheet("Sheet1") firstrow clear
local x ""
foreach var of varlist *SA {
local x `x' `var'
}
clear
restore
forvalue z = 537(3)647 {
pca `x' if rMonth < `z'+3, comp(1)
esttab e(L) using pc`z'.csv, replace
}
}
Maybe this example helps:
clear all
set more off
/*
load two example MS Excel files with var names only and accumulate var names in a local.
files are named varfile.xls and varfile2.xls
*/
foreach i in "" "2" {
import excel "/home/roberto/Desktop/stata_tests/varfile`i'.xls", firstrow clear
* get var names
quietly ds
* save var names in local
local myvars `myvars' `r(varlist)'
}
* load database that contains vars and values
sysuse auto, clear
* do pca
pca `myvars'
/*
varfile.xls contains variables "weight" and "price"
varfile2.xls contains variables "mpg" and "length"
*/
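ds works here because it picks up the variable names found in the MS Excel sheet and stores them in r(varlist). See help ds and help saved results (or help stored results). After that, we load the "full" database and use the stored variable names with pca.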
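The MS Excel files contain only the variable names, as described in the comments in the code above.
For reference, the fuller code posted by the asker: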
set excelxlsxlargefile on
cd C:\Users\M\Dropbox\Masterarbeit\Stata12\sentiment_6m
import excel "C:\Users\M\Dropbox\Masterarbeit\Daten\Dataimport\sentiments\Google Query CDX.xlsx", sheet("Tabelle1") firstrow
set more off
gen Month = month( Date)
gen January = 1 if Month == 1
gen February = 1 if Month == 2
gen March = 1 if Month == 3
gen April = 1 if Month == 4
gen May = 1 if Month == 5
gen June = 1 if Month == 6
gen July = 1 if Month == 7
gen August = 1 if Month == 8
gen September = 1 if Month == 9
gen October = 1 if Month == 10
gen November = 1 if Month == 11
gen December = 1 if Month == 12
replace January = 0 if January == .
replace February = 0 if February == .
replace March = 0 if March == .
replace April = 0 if April == .
replace May = 0 if May == .
replace June = 0 if June == .
replace July = 0 if July == .
replace August = 0 if August == .
replace September = 0 if September == .
replace October = 0 if October == .
replace November = 0 if November == .
replace December = 0 if December == .
foreach var of varlist *_qry{
sum `var', meanonly
local mu =r(mean)
reg `var' January February March April May June July August September October November December, nocons
predict double `var'SA, residual
replace `var'SA=`var'SA+`mu'
egen sd = sd(`var'SA)
replace `var'SA=`var'SA/sd
drop sd
drop `var'
}
* BIG LOOP *
generate double rMonth = mofd( Date)
global tflist ""
forvalue y = 537(3)647{
foreach var of varlist *SA{
reg MidCDX `var' if rMonth<=`y'
tempfile tfcur
parmest, idstr("`var'") saving(`"`tfcur'"', replace) flis(tflist)
}
* Concatenate files into memory (REPLACING THE OLD DATA) *
preserve
clear
append using $tflist
sencode idstr, gene(xvar)
lab var xvar "X-variable"
keybygen xvar, gene(parmseq)
drop if parm=="_cons"
egen rank = rank (-t)
gsort -t
drop if rank>40
save `y', replace
export excel xvar t using `y', firstrow(variables) replace
foreach TF in $tflist {
erase `"`TF'"'
}
global tflist ""
restore
}
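I think this answers the specific question you asked.
Edit
Looking more closely at your code, I am not sure the problem has to do with matching variable names in the full database. It looks more like a problem with the way preserve and restore are set up. Instead of using that pair of commands, try loading the whole database when you need it (using use).
What do you have in memory before preserve? Where does your error appear? Please post more code. A reproducible example would help.
Edit 2
My guess now is that you have nothing in memory before preserve, so when you restore you are just wiping the slate clean; you are restoring a blank database. Trying pca at that point therefore gives: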
no variables defined
r(111);
preserve keeps the data as they were at the moment the command was issued.
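The behaviour guessed at above can be reproduced with a small sketch. It assumes nothing is in memory when preserve is issued and uses the auto dataset shipped with Stata as a stand-in for the imported Excel file; it is not the asker's actual data.

```stata
clear all
* Nothing is in memory yet, so this preserves an empty dataset
preserve
* Stand-in for the import excel step inside the loop
sysuse auto, clear
describe, short
* Go back to the preserved state, which was empty
restore
* c(k) is the number of variables in memory; it should be 0 again
display c(k)
* A call such as "pca price mpg" would now stop with
* "no variables defined", r(111)
```

Loading the master file with use immediately before pca, rather than relying on restore, avoids depending on whatever happened to be in memory when preserve was issued.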
@Roberto Ferrer is addressing your main question, which turns on comparing variable names between files. I am adding some details on the use of local macros and wildcard syntax.
local x ""
foreach var of varlist *SA {
local x `x' `var'
}
The loop above is a long-winded way of writing:
unab x : *SA
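To check that the two forms are equivalent, here is a minimal sketch on the auto dataset shipped with Stata. The wildcard m* and the macro name y are stand-ins chosen for illustration; the question uses *SA.

```stata
sysuse auto, clear
* Loop version, as in the question
local x ""
foreach var of varlist m* {
    local x `x' `var'
}
* One-line version: unab expands the wildcard into full variable names
unab y : m*
display "`x'"
display "`y'"
* Both macros should hold the same list of names
assert "`x'" == "`y'"
```

unab stops with an error if the pattern matches no variables, which is usually what you want when a later command depends on the list.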
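Personal comment: there is too much code here for me to want to try to absorb what you are trying to do. I will comment only on some details of technique.
This code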
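gen January = 1 if Month == 1
gen February = 1 if Month == 2
gen March = 1 if Month == 3
gen April = 1 if Month == 4
gen May = 1 if Month == 5
gen June = 1 if Month == 6
gen July = 1 if Month == 7
gen August = 1 if Month == 8
gen September = 1 if Month == 9
gen October = 1 if Month == 10
gen November = 1 if Month == 11
gen December = 1 if Month == 12
replace January = 0 if January == .
replace February = 0 if February == .
replace March = 0 if March == .
replace April = 0 if April == .
replace May = 0 if May == .
replace June = 0 if June == .
replace July = 0 if July == .
replace August = 0 if August == .
replace September = 0 if September == .
replace October = 0 if October == .
replace November = 0 if November == .
replace December = 0 if December == .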
can be rewritten as
tokenize "`c(Months)'"
forval j = 1/12 {
gen ``j'' = Month == `j'
}
The month names January to December are built into c(Months).
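A self-contained sketch of the same idea follows. The toy Date variable (two years of mid-month dates) and the variable name check are assumptions made so the example runs on its own; they are not part of the asker's data.

```stata
clear
set obs 24
* Hypothetical dates: the 15th of each month, cycling through the year twice
generate Date = mdy(mod(_n - 1, 12) + 1, 15, 2010)
generate Month = month(Date)
* Put the twelve month names into `1', `2', ..., `12'
tokenize "`c(Months)'"
forvalues j = 1/12 {
    * The logical expression is 1 when true and 0 when false,
    * so no later "replace ... = 0" step is needed
    generate byte ``j'' = Month == `j'
}
* The dummies were created in order, so January-December spans all twelve
egen check = rowtotal(January-December)
assert check == 1
```

One caveat: if Month could be missing, the expression Month == `j' yields 0 rather than missing, so add an if !missing(Month) qualifier when that matters.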
Likewise, the code inside the seasonal-adjustment loop can be shortened to
reg `var' January-December, nocons
predict double `var'SA, residual
sum `var'
replace `var'SA = (`var'SA + r(mean)) / r(sd)
Note that creating an entire variable just to hold the SD is not a good idea. It cancels out the time saved by using summarize, meanonly.
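As a rough illustration of the point about the SD, the sketch below uses the auto dataset shipped with Stata. The variable price and the new name price_std are stand-ins, not part of the original code.

```stata
sysuse auto, clear
* Plain summarize leaves both r(mean) and r(sd) behind
quietly summarize price
display r(mean) "  " r(sd)
* With meanonly the SD is not computed, so r(sd) evaluates to missing
quietly summarize price, meanonly
display r(mean) "  " r(sd)
* Use the returned scalar directly; no egen variable is needed
quietly summarize price
generate double price_std = price / r(sd)
```

So when both the mean and the SD are needed, one plain summarize call supplies them as scalars, and meanonly is only the right choice when the mean alone is used.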
I won't comment here on what you are trying to do statistically by adding the mean and then dividing by the SD.