R/plm: extracting residuals by index
I created a plm object using the following:
require(plm)
plm1 <- plm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris, index = "Species")
It would be nice to have something like this:
> df1 <- data.frame(time = rep(1:10,15), Species = iris$Species, resid1 = runif(150))
> head(df1)
time Species resid1
1 1 setosa 0.7038776
2 2 setosa 0.2164597
3 3 setosa 0.1988884
4 4 setosa 0.9311872
5 5 setosa 0.7087211
6 6 setosa 0.9914357
I could then use ddply or aggregate to find the R-squared for each species.
Any suggestions?

Maybe something like this would work:
library(plm)
plm1 <- plm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris, index = "Species")
res <- residuals(plm1)
df <- cbind(as.vector(res), attr(res, "index"))
names(df) <- c("resid", "species", "time")
str(df)
## 'data.frame': 150 obs. of 3 variables:
## $ resid : num 0.1499 -0.0501 -0.1595 -0.4407 0.0499 ...
## $ species: Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ time : Factor w/ 50 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
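To get from this residual table to the per-species R-squared the question asks about, here is a minimal sketch. It assumes a "within-species" definition of R-squared (1 - SSR/SST, with SST taken around each species' own mean), which is my choice and not something stated in the thread. It also relies on iris already being sorted by Species, so that row i of the residual table lines up with row i of iris. The names df, y and r2 are only illustrative.

```r
library(plm)

plm1 <- plm(Sepal.Length ~ Petal.Length + Petal.Width,
            data = iris, index = "Species")
res <- residuals(plm1)

# Residuals together with the index plm attached to them
df <- data.frame(resid = as.vector(res), attr(res, "index"))
names(df) <- c("resid", "species", "time")

# Assumption: iris is sorted by Species, so the rows line up one to one
df$y <- iris$Sepal.Length

# Per-species R-squared: 1 - SSR / SST within each species
r2 <- sapply(split(df, df$species), function(g) {
  1 - sum(g$resid^2) / sum((g$y - mean(g$y))^2)
})
r2
```

If the data were not already sorted by the index, the residuals would have to be matched to the response by key first, not by row position.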
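This is an old question, but I want to point out something that is easy to miss and can lead to serious errors. The answer above is correct, but I thought I should clarify why such a workaround is needed, since it may not be obvious.
From reading around, I learned the following: as noted, plm does not necessarily keep the data in the order in which it was passed to the function. This means that simply calling residuals() on a plm object and binding the result to your data can attach the wrong residuals to the wrong rows if you are not careful! As an example, consider the following: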
require(plm)
data("Gasoline") # The Gasoline dataset from the plm package
plm1 <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap, data=Gasoline, model = "within", index = c("country", "year"))
coef(plm1)
lincomep lrpmg lcarpcap
0.6622497 -0.3217025 -0.6404829
head(residuals(plm1))
1 2 3 4 5 6
-0.18814207 -0.19642727 -0.14874420 -0.12476346 -0.12114060 -0.08684045
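The definitions of Gasoline2 and plm2 do not appear in this thread as captured. One way they might be built, assuming Gasoline2 is simply a row-shuffled copy of Gasoline and plm2 is the same model fitted to it (the seed and the use of sample() are my assumptions):

```r
library(plm)
data("Gasoline", package = "plm")

set.seed(1)  # arbitrary seed, only so the shuffle is reproducible
Gasoline2 <- Gasoline[sample(nrow(Gasoline)), ]  # same rows, different order

plm1 <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap, data = Gasoline,
            model = "within", index = c("country", "year"))
plm2 <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap, data = Gasoline2,
            model = "within", index = c("country", "year"))

# The estimates should not depend on row order
same_coef <- all.equal(coef(plm1), coef(plm2))

# The residuals come back in country/year order for both fits,
# not in the row order of Gasoline2
same_resid <- all.equal(as.vector(residuals(plm1)),
                        as.vector(residuals(plm2)))
same_coef
same_resid
```

Comparing the two residual vectors this way is a quick check that plm has reordered the shuffled data internally.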
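This means that if Gasoline2 (the same data with its rows in a different order, with plm2 fitted to it) is the dataset we are working with, using a function like cbind() on Gasoline2 and residuals(plm2) will attach the wrong residuals to the observations: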
head(cbind(Gasoline, residuals(plm1)))
country year lgaspcar lincomep lrpmg lcarpcap residuals(plm1)
1 AUSTRIA 1960 4.173244 -6.474277 -0.3345476 -9.766840 -0.18814207
2 AUSTRIA 1961 4.100989 -6.426006 -0.3513276 -9.608622 -0.19642727
3 AUSTRIA 1962 4.073177 -6.407308 -0.3795177 -9.457257 -0.14874420
4 AUSTRIA 1963 4.059509 -6.370679 -0.4142514 -9.343155 -0.12476346
5 AUSTRIA 1964 4.037689 -6.322247 -0.4453354 -9.237739 -0.12114060
6 AUSTRIA 1965 4.033983 -6.294668 -0.4970607 -9.123903 -0.08684045
head(cbind(Gasoline2, residuals(plm2)))
country year lgaspcar lincomep lrpmg lcarpcap residuals(plm2)
258 SWEDEN 1970 3.989372 -7.732610 -2.7335921 -8.164506 -0.18814207
7 AUSTRIA 1966 4.047537 -6.252545 -0.4668377 -9.019822 -0.19642727
64 DENMARK 1966 4.233643 -5.851866 -0.3961885 -8.681541 -0.14874420
73 DENMARK 1975 4.033015 -5.612967 -0.3939543 -8.274632 -0.12476346
268 SWITZERL 1961 4.441330 -6.111640 -0.8655847 -9.158229 -0.12114060
186 JAPAN 1974 4.007964 -5.852553 -0.1909064 -8.846520 -0.08684045
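As shown above, in the Gasoline2 example the residuals are assigned to the wrong rows.
What happened? As mentioned earlier, plm does not preserve the order of the observations. Using the attr() function that dickoa pointed out in the earlier answer, we can see that plm reorganized the data by country and year: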
head( attr(residuals(plm2), "index") )
country year
1 AUSTRIA 1960
2 AUSTRIA 1961
3 AUSTRIA 1962
4 AUSTRIA 1963
5 AUSTRIA 1964
6 AUSTRIA 1965
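This is how the original Gasoline data is structured, which is why its residuals came out in the same order as its rows.
So we can use attr(residuals(plm2), "index") to get each residual together with its country and year, and then add the residuals to the original data by matching on those keys. As mentioned earlier, the plyr package is very helpful for this: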
require(plyr)
resids2 <- data.frame(residual = residuals(plm2), attr(residuals(plm2), "index"))
Gasoline2$year <- factor(Gasoline2$year) # Needed because resids2$year is a factor while Gasoline2$year was an integer; plyr's join() does not accept key columns of different types.
Gasoline2 <- join(Gasoline2, resids2, by = c("country", "year"))
head(Gasoline2)
country year lgaspcar lincomep lrpmg lcarpcap residual
1 SWEDEN 1970 3.989372 -7.732610 -2.7335921 -8.164506 -0.02468148
2 AUSTRIA 1966 4.047537 -6.252545 -0.4668377 -9.019822 -0.02479759
3 DENMARK 1966 4.233643 -5.851866 -0.3961885 -8.681541 0.03175032
4 DENMARK 1975 4.033015 -5.612967 -0.3939543 -8.274632 -0.06575219
5 SWITZERL 1961 4.441330 -6.111640 -0.8655847 -9.158229 -0.05789130
6 JAPAN 1974 4.007964 -5.852553 -0.1909064 -8.846520 -0.21957156
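Comments:
Does this do what you need? iris$residuals
@RichardHerron: Do the rows match the index?
Yes, unless you have missing observations. I use complete.cases to make sure there are no missing observations in my data.
Since plm can handle unbalanced panels automatically, if the residuals have NAs, can we still merge them into the original data frame?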
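On the last question, a key-based merge still works when some rows were dropped from the fit. The following is a rough sketch using base R merge() instead of plyr. It assumes that the index attribute on the residuals covers only the rows plm actually used, so dropped rows simply get NA. The NA positions and the names gas, fit, resids, row_id and merged are hypothetical.

```r
library(plm)
data("Gasoline", package = "plm")

gas <- Gasoline
gas$lrpmg[c(3, 40)] <- NA  # hypothetical missing values, for illustration only

fit <- plm(lgaspcar ~ lincomep + lrpmg + lcarpcap, data = gas,
           model = "within", index = c("country", "year"))

res <- residuals(fit)
resids <- data.frame(residual = as.vector(res), attr(res, "index"))

# Compare keys as character so factor vs. integer types do not matter
resids$country <- as.character(resids$country)
resids$year    <- as.character(resids$year)
gas$country    <- as.character(gas$country)
gas$year       <- as.character(gas$year)

gas$row_id <- seq_len(nrow(gas))  # remember the original order; merge() re-sorts
merged <- merge(gas, resids, by = c("country", "year"), all.x = TRUE)
merged <- merged[order(merged$row_id), ]

sum(is.na(merged$residual))  # number of rows without a residual
```

With all.x = TRUE every original row is kept, and rows that did not enter the fit carry NA in the residual column.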