lm() reports a result despite singular.ok = FALSE, while solve(t(X) %*% X) %*% (t(X) %*% y) (correctly) reports an error


My test dataset is

> y
        DLogPrice
 [1,]   3.4232680
 [2,]  -1.0099196
 [3,]   0.7867983
 [4,]  -1.2224441
 [5,]   3.5718083
 [6,]  -0.4550516
 [7,]   1.6733032
 [8,]   1.6540079
 [9,]   0.6122239
[10,]  -1.3530304
[11,] -18.9058749
[12,]  15.6916978
[13,]   1.9088818

Solving for the linear regression coefficient vector b in y = Xb with solve(t(X) %*% X) %*% (t(X) %*% y) results in a singularity error when trying to invert X'X:

> solve(t(X) %*% X) %*% (t(X) %*% y)
Error in solve.default(t(X) %*% X) : 
  system is computationally singular: reciprocal condition number = 8.6658e-43
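
The error above comes from a check on the reciprocal condition number of X'X. The sketch below shows how that check can be inspected directly. The matrix X in it is hypothetical (the question's X is not shown); it only assumes two columns that are far from collinear but differ hugely in scale.

```r
# Hypothetical two-column design matrix, NOT the question's X:
# the columns are nearly orthogonal but live on very different scales.
X <- cbind(big   = c(1, 2, 3, 4, 5) * 1e10,
           small = c(1, -1, 2, -2, 0.5) * 1e-10)

XtX <- crossprod(X)   # same matrix as t(X) %*% X

# Estimated reciprocal condition number of X'X. solve() refuses to invert
# a matrix when this falls below its default tolerance, .Machine$double.eps.
rc <- rcond(XtX)
too_small <- rc < .Machine$double.eps

# Capture the error message instead of stopping the script.
msg <- tryCatch({
  solve(XtX)
  "no error"
}, error = function(e) conditionMessage(e))

print(rc)
print(too_small)
print(msg)
```

In this sketch the scale difference alone is enough to push X'X below solve()'s tolerance, even though neither column is redundant.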
I do not understand why stats::lm() reports a result even though singular.ok = FALSE is set, as in:

> df <- data.frame(y,X)
> 
> test.lm <- stats::lm(DLogPrice ~ DQ + DQInverseSize + DQLogSize + DQSize + DQSize2 -1, data=df, singular.ok = FALSE)
> summary(test.lm)

Call:
stats::lm(formula = DLogPrice ~ DQ + DQInverseSize + DQLogSize + 
    DQSize + DQSize2 - 1, data = df, singular.ok = FALSE)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.2934 -1.6251  0.2413  1.0087  6.0501 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)  
DQ            -4.492e+07  1.956e+07  -2.297   0.0507 .
DQInverseSize  1.503e+11  6.497e+10   2.314   0.0494 *
DQLogSize      4.038e+06  1.759e+06   2.295   0.0509 .
DQSize        -3.606e+01  1.585e+01  -2.275   0.0524 .
DQSize2        5.353e-05  2.374e-05   2.255   0.0542 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.618 on 8 degrees of freedom
Multiple R-squared:  0.8371,    Adjusted R-squared:  0.7353 
F-statistic: 8.222 on 5 and 8 DF,  p-value: 0.005156
What am I missing or misunderstanding here?
Thanks for any ideas.

Under the hood, R's lm.fit() function uses a QR decomposition, which lets it handle nearly singular cases like this one more robustly:

#>  3.547059e+00 -1.853870e+07  6.137535e+10  1.668251e+06 -1.508519e+01
#>       DQSize2
#>  2.268808e-05
coef(lm(as.matrix(y) ~ as.matrix(x)))
#>               (Intercept)            as.matrix(x)DQ as.matrix(x)DQInverseSize
#>              3.547059e+00             -1.853870e+07              6.137535e+10
#>     as.matrix(x)DQLogSize        as.matrix(x)DQSize       as.matrix(x)DQSize2
#>              1.668251e+06             -1.508519e+01              2.268808e-05
Created on 2021-06-01 by the reprex package (v2.0.0)
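
To illustrate why singular.ok = FALSE may not trigger, here is a minimal sketch on hypothetical data (not the question's X and y). It relies on the documented defaults of lm.fit(): the rank is determined from a QR decomposition of X with tol = 1e-07, and the fit is only rejected when that rank is smaller than the number of columns.

```r
# Hypothetical data, NOT the question's: two badly scaled but
# non-redundant columns and a response with known coefficients.
X <- cbind(big   = c(1, 2, 3, 4, 5) * 1e10,
           small = c(1, -1, 2, -2, 0.5) * 1e-10)
true_b <- c(2e-10, 3e10)
y <- drop(X %*% true_b)

# QR-based fit: both columns count as independent, so the rank is full
# and singular.ok = FALSE has nothing to complain about.
fit <- lm.fit(X, y, singular.ok = FALSE)
print(fit$rank)
print(fit$coefficients)

# A genuinely redundant column (an exact multiple of another one)
# lowers the rank, and only then does singular.ok = FALSE stop the fit.
X_dup <- cbind(X, twice = 2 * X[, "big"])
rank_dup <- lm.fit(X_dup, y)$rank
err <- tryCatch({
  lm.fit(X_dup, y, singular.ok = FALSE)
  "no error"
}, error = function(e) conditionMessage(e))
print(rank_dup)
print(err)
```

Under these assumptions, lm() and solve() are answering different questions: lm() asks whether a column of X is redundant, while solve() asks whether X'X is numerically invertible.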

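The textbook definition of OLS, solve(t(x) %*% x) %*% t(x) %*% y, is computationally very inefficient and is not a good way to implement OLS. Forming t(x) %*% x also squares the condition number of x, so a design matrix that a QR decomposition can still handle may look singular to solve().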
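
As a rough sketch of two alternatives to the textbook formula, again on hypothetical data rather than the question's: qr.solve() computes the least-squares solution from a QR decomposition of x itself, and rescaling each column to unit length before forming the normal equations removes the scale problem. The names norms, Xs, b_qr and b_scaled are illustrative only.

```r
# Hypothetical data, NOT the question's: badly scaled, non-redundant columns.
X <- cbind(big   = c(1, 2, 3, 4, 5) * 1e10,
           small = c(1, -1, 2, -2, 0.5) * 1e-10)
true_b <- c(2e-10, 3e10)
y <- drop(X %*% true_b)

# 1. Least squares via a QR decomposition of X, without forming X'X.
b_qr <- qr.solve(X, y)

# 2. Normal equations after scaling every column to unit Euclidean length.
#    Coefficients for the scaled columns are mapped back by dividing
#    by the same column norms.
norms <- sqrt(colSums(X^2))
Xs <- sweep(X, 2, norms, "/")
b_scaled <- drop(solve(crossprod(Xs), crossprod(Xs, y))) / norms

print(b_qr)
print(b_scaled)
```

Rescaling helps in this sketch because the columns are only badly scaled; it would not rescue a design whose columns are nearly linearly dependent.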
