R: Regressions with subsets and a baseline specification + different variables
I want to regress a variable on a baseline specification and then on seven additional variables, each added in turn (i.e. 8 regressions). I want to do this for two subsets of the data.frame and for two subsets of the additional variables. I then want to save the output of these 8x2x2=32 regressions, grouped by subset (so 4 files), with stargazer. As you can imagine, that is a lot of typing. Some existing answers are related (e.g. using ddply), but I have trouble combining them, especially given that the baseline variables stay the same in every regression.
Here is my data, with the baseline variables (controls) and the additional variables reduced to two each:
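# Toy data: four observations per variable
Two.Year <- 1:4
Length <- 4:7
NumAck <- 8:11
degree_max <- 15:18
degree_median <- 16:19
katz_max <- 19:22
katz_median <- 23:26
Year <- rep(c("early","late"), each=2)
# data.frame() keeps numeric columns numeric; as.data.frame(cbind(...)) would coerce them all to character
Master <- data.frame(
  Two.Year, Length, NumAck, degree_max, degree_median, katz_max, katz_median, Year
)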
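The second and third regressions each add a variable ending in _max, i.e. lm(Two.Year ~ Length + NumAck + degree_max, data=Master) and lm(Two.Year ~ Length + NumAck + katz_max, data=Master). This gives the second kind of subset: all variables ending in _max on the one hand and those ending in _median on the other. So far I extract these variables with grepl("_median", names(Master)) and grepl("_max", names(Master)).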
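The suffix-based selection described above could be turned into one formula per regression roughly as follows. This is only a sketch: the vector vars stands in for names(Master), and the helper make_formulas is a hypothetical name introduced here, not something from the question. It relies on base R's reformulate() to build each formula from character vectors.

```r
# Hypothetical stand-in for names(Master)
vars <- c("Two.Year", "Length", "NumAck",
          "degree_max", "degree_median", "katz_max", "katz_median", "Year")

# Controls that appear in every regression
controls <- c("Length", "NumAck")

# Select the additional variables by suffix
max_vars    <- vars[grepl("_max$", vars)]
median_vars <- vars[grepl("_median$", vars)]

# Build the baseline formula plus one formula per additional variable
# (make_formulas is an illustrative helper, not part of any package)
make_formulas <- function(extra) {
  baseline <- list(baseline = reformulate(controls, response = "Two.Year"))
  added <- lapply(extra, function(v) {
    reformulate(c(controls, v), response = "Two.Year")
  })
  c(baseline, setNames(added, extra))
}

max_formulas    <- make_formulas(max_vars)
median_formulas <- make_formulas(median_vars)
```

Each element of max_formulas can then be passed to lm() together with the relevant subset of the data, so the baseline terms are written down only once.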
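As mentioned, I want to save the output grouped by subset, i.e. all regressions for (I) early and max, (II) early and median, (III) late and max, and (IV) late and median.
This is what I have tried so far: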
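library(plyr)
# Split the data by period (the sample data stores the period in Year)
Master.subset <- split(Master, Master$Year)
# One regression on all remaining columns per period
ols <- ddply(Master, "Year",
  function(d) coefficients(lm(Two.Year ~ ., data = d[, names(d) != "Year"])))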
This is what I ended up doing.
First, I defined the variables used in the three loops:
# Baseline model shared by every regression
baseline <- "Two.Year ~ Length + NumAck"
# Prefixes of the model-specific variables
measure <- c("degree", "katz")
# Values of the variable that determines the subset
timepoints <- c("early", "late")
library(stargazer)
# First loop: one pass per time period
for (tmpnt in timepoints) {
  period_data <- subset(Master, Year == tmpnt)
  # Start a fresh list per period; the baseline model does not depend on the variable subset
  ols <- list(baseline = lm(as.formula(baseline), data = period_data))
  # Second loop: the variable subset
  for (type in c("median", "max")) {
    # Third loop: the baseline plus one additional variable per model
    for (msr in measure) {
      ols[[msr]] <- lm(as.formula(paste(baseline, paste(msr, type, sep="_"), sep="+")),
                       data = period_data)
    }
    # Write one table per period and variable subset (the file name pattern is an assumption)
    stargazer(ols,
              title = paste("Regression output for", tmpnt, "subsample using", type),
              out = paste0("ols_", tmpnt, "_", type, ".tex"))
  }
}
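For comparison, the same grouping can be sketched without nested for loops by fitting one list of models per period and suffix combination. This is a hedged alternative, not the approach used above: the data frame toy is simulated with arbitrary values so that every model can actually be estimated, and fit_group is a hypothetical helper name. Only base R is used, and each element of the result could be handed to stargazer afterwards.

```r
set.seed(1)
n <- 40
# Simulated stand-in for Master; the values are arbitrary
toy <- data.frame(
  Two.Year = rnorm(n), Length = rnorm(n), NumAck = rnorm(n),
  degree_max = rnorm(n), degree_median = rnorm(n),
  katz_max = rnorm(n), katz_median = rnorm(n),
  Year = rep(c("early", "late"), each = n / 2)
)

controls <- c("Length", "NumAck")
measure  <- c("degree", "katz")
types    <- c("max", "median")

# Fit the baseline plus one model per measure for a given period and suffix
# (fit_group is an illustrative helper, not part of any package)
fit_group <- function(period, type) {
  d <- toy[toy$Year == period, ]
  extras <- paste(measure, type, sep = "_")
  rhs <- c(list(controls), lapply(extras, function(v) c(controls, v)))
  models <- lapply(rhs, function(terms) {
    lm(reformulate(terms, response = "Two.Year"), data = d)
  })
  setNames(models, c("baseline", extras))
}

# All period x suffix combinations: four groups of three models each
groups <- expand.grid(period = c("early", "late"), type = types,
                      stringsAsFactors = FALSE)
fits <- Map(fit_group, groups$period, groups$type)
names(fits) <- paste(groups$period, groups$type, sep = "_")
```

Here fits$early_max holds the baseline, degree_max and katz_max models for the early period, which corresponds to group (I) in the question.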