R: predict function on a regression model gives an error


I am trying to predict values of the y variable from a polynomial model.

lumber.predict.plm=lm(lumber.unemployment.women$lumber.1980.2000 ~ 
                        scale(lumber.unemployment.women$woman.1980.2000) +
                        I(scale(lumber.unemployment.women$woman.1980.2000)^2))

xmin=min(lumber.unemployment.women$woman.1980.2000)
xmax=max(lumber.unemployment.women$woman.1980.2000)
predicted.lumber.whole=data.frame(x=seq(xmin, xmax, length.out=500))
predicted.lumber.whole$lumber=predict(lumber.predict.plm,newdata=predicted.lumber.whole,
                                       interval="confidence")

All of the commands above work except the last one, which gives the following error:

predicted.lumber.whole$lumber=predict(lumber.predict.plm,newdata=predicted.lumber.whole,
+                                        interval="confidence")

#Error in `$<-.data.frame`(`*tmp*`, "lumber", value = c(134.507238798567,  : 
#  replacement has 252 rows, data has 500
#In addition: Warning message:
#'newdata' had 500 rows but variables found have 252 rows

Why do the predicted values depend on the number of observations in my data frame?

I think the following is your problem, although the error message seems a bit vague to me. Here is a simplified version of your code:

L=data.frame(woman=1:100, lumber=1:100+rnorm(100))
L.lm= lm(lumber ~ woman, data=L) 
xmin =-20; xmax= 120;

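Calling predict with a newdata data frame whose only column is named "x" gives an error, because the new data does not contain the variable used in the original data. Note that lm() above does not automatically assign the predictor to a variable named "x"; it looks for "woman" instead. So if you run summary(L.lm), you will see that the coefficient is named "woman", not "x".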
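The mismatch described above can be reproduced with a short sketch. The seed, the name nd.bad, and the tryCatch wrapper are additions for illustration and are not part of the original answer; the sketch also assumes that no object called woman exists in the workspace.

```r
# Simplified data from the answer; set.seed is an added assumption for reproducibility.
set.seed(42)
L <- data.frame(woman = 1:100, lumber = 1:100 + rnorm(100))
L.lm <- lm(lumber ~ woman, data = L)

# Hypothetical new data whose column is called "x" instead of "woman".
nd.bad <- data.frame(x = seq(-20, 120, length.out = 500))

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    predict(L.lm, newdata = nd.bad)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# The predictor name the model actually looks for in newdata.
term.names <- attr(terms(L.lm), "term.labels")
print(term.names)
```

The term label printed at the end is the column name that newdata has to provide.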

The following works, because the original data and the new data contain the same variable:

nd=data.frame(woman=seq(xmin, xmax, length.out=500))
predict(L.lm, newdata=nd,interval="confidence")

        fit       lwr       upr
1 -20.32932 -20.85072 -19.80792
2 -20.04737 -20.56699 -19.52775
3 -19.76542 -20.28327 -19.24757
4 -19.48347 -19.99955 -18.96740
5 -19.20153 -19.71582 -18.68723
6 -18.91958 -19.43210 -18.40705
etc..

P.S. To be clear, this also works with

L.lm= lm(lumber ~ poly(woman,2), data=L)

which is a more concise way of expressing a polynomial fit.

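I just changed the linear model's name, and it works fine. But I still don't know the root cause of the error! It would be great if someone could explain the reason for the earlier error. The modified script is shown below: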

lumber.predict.plm1=lm(lumber.1980.2000 ~ scale(woman.1980.2000) +
                        I(scale(woman.1980.2000)^2), data=lumber.unemployment.women)
xmin=min(lumber.unemployment.women$woman.1980.2000)
xmax=max(lumber.unemployment.women$woman.1980.2000)
predicted.lumber.all=data.frame(woman.1980.2000=seq(xmin,xmax,length.out=100))
predicted.lumber.all$lumber=predict(lumber.predict.plm1,newdata=predicted.lumber.all)
> str(predicted.lumber.all)
'data.frame':   100 obs. of  2 variables:
 $ woman.1980.2000: num  3.3 3.36 3.42 3.48 3.54 ...
 $ lumber         : num  195 193 192 190 188 ...
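A likely explanation for the earlier error, sketched on the simplified data from the other answer: when the formula refers to columns as L$woman, the model term is literally named "L$woman", so predict re-evaluates that expression against the original data frame and ignores the column in newdata. The names fit.dollar, fit.data, and nd, as well as the seed, are hypothetical and chosen only for this illustration.

```r
set.seed(42)  # added assumption, only for reproducibility
L <- data.frame(woman = 1:100, lumber = 1:100 + rnorm(100))
nd <- data.frame(woman = seq(-20, 120, length.out = 500))

# Same model, specified in two ways.
fit.dollar <- lm(L$lumber ~ L$woman)          # columns referenced with $
fit.data   <- lm(lumber ~ woman, data = L)    # bare names plus data=

# The stored term labels differ, and that is what predict looks up.
labels.dollar <- attr(terms(fit.dollar), "term.labels")
labels.data   <- attr(terms(fit.data), "term.labels")
print(labels.dollar)
print(labels.data)

# With the $ form, predict warns and falls back to the 100 original rows;
# with the data= form it returns one prediction per row of nd.
p.dollar <- suppressWarnings(predict(fit.dollar, newdata = nd))
p.data   <- predict(fit.data, newdata = nd)
print(length(p.dollar))
print(length(p.data))
```

Assigning the shorter result to a column of a 500-row data frame is what then triggers the "replacement has ... rows" error. This also suggests that the fix in the revised script is the switch to bare column names with data= (and a matching column name in newdata), not the new model name.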

Try giving predict all of the variables you used, including the polynomial terms.

Unfortunately, even after I changed the variable name, it still gives the same error :(

> predicted.lumber.whole=data.frame(woman.1980.2000=seq(xmin,xmax,length.out=500))
> predicted.lumber.whole$lumber=predict(lumber.predict.plm,newdata=predicted.lumber.whole,
+                                        interval="confidence")
Error

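Thanks for the note about the polynomial fit. Do you know of a way to create the model with the poly function but using a scaled x variable? That is, L.lm=lm(lumber~poly(scale(woman),2),data=L). But when I plot the graph, it should be lumber~woman rather than scale(woman). Do you know a way to achieve that?

No, sorry, I can't see what other error occurs when you execute it without the assignment: predict(lumber.predict.plm,newdata=predicted.lumber.whole,interval="confidence")

On page 304 of the book IPSUR, scaling is recommended for regression even when there is only one variable, as long as the degree is greater than one.

On reflection, yes, you are right about scaling. Sorry, no, I can't think of how to plot against the original values.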
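One possible way to get predictions, and therefore a plot, on the original x scale while still fitting on a scaled predictor is sketched below. It is not from the thread: the constants m and s, the model names, and the seed are assumptions made for illustration. The idea is to compute the centre and spread once from the training data and write the scaling into the formula, so that newdata can stay in original units. On this simulated data the result agrees with poly(woman, 2), which spans the same quadratic model.

```r
set.seed(42)  # added assumption, only for reproducibility
L <- data.frame(woman = 1:100, lumber = 1:100 + rnorm(100))

# Hypothetical constants: centre and spread taken from the training data only.
m <- mean(L$woman)
s <- sd(L$woman)

# Quadratic fit on the manually scaled predictor, and the poly() equivalent.
fit.scaled <- lm(lumber ~ I((woman - m) / s) + I(((woman - m) / s)^2), data = L)
fit.poly   <- lm(lumber ~ poly(woman, 2), data = L)

# New data stays in the original units of woman.
nd <- data.frame(woman = seq(min(L$woman), max(L$woman), length.out = 50))
p.scaled <- predict(fit.scaled, newdata = nd, interval = "confidence")
p.poly   <- predict(fit.poly, newdata = nd, interval = "confidence")

# Both parameterisations describe the same quadratic, so the predictions
# and confidence bounds should agree up to numerical tolerance.
same <- isTRUE(all.equal(p.scaled, p.poly, check.attributes = FALSE))
print(same)
```

The fitted curve can then be drawn against the original values with plot(lumber ~ woman, data = L) followed by lines(nd$woman, p.scaled[, "fit"]).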