R: Does adding multiple explanatory variables to a logistic regression (glm) model produce errors?

I have tried to fit the following logistic regression model:

ad.glm.all <- glm(WinLoss ~ Score + Margin + Opposition + Venue + Disposals +
                    Marks + Goals + Behinds + Hitouts + Tackles + Rebound50s +
                    Inside50s + Clearances + Clangers + FreesFor +
                    ContendedPossessions + ContestedMarks + MarksInside50 +
                    OnePercenters + Bounces + GoalAssists,
                  data = ad.train, family = binomial)

When I look at the summary of this model, I get:

Call:
glm(formula = WinLoss ~ Score + Margin + Disposals + Marks + 
    Goals + Behinds + Hitouts + Tackles + Rebound50s + Inside50s + 
    Clearances + Clangers + FreesFor + ContendedPossessions + 
    ContestedMarks + MarksInside50 + OnePercenters + Bounces + 
    GoalAssists, family = binomial, data = ad.train)

Deviance Residuals: 
       Min          1Q      Median          3Q         Max  
-2.980e-05  -2.100e-08   2.100e-08   2.100e-08   3.569e-05  

Coefficients:
                       Estimate Std. Error z value Pr(>|z|)
(Intercept)          -8.578e+00  2.502e+06   0.000        1
Score                 4.194e+00  5.165e+04   0.000        1
Margin                2.187e+00  3.742e+03   0.001        1
Disposals             8.946e-02  3.549e+03   0.000        1
Marks                 1.427e-01  1.938e+03   0.000        1
Goals                -2.288e+01  3.082e+05   0.000        1
Behinds              -7.034e+00  5.482e+04   0.000        1
Hitouts               3.640e-02  5.167e+03   0.000        1
Tackles               8.939e-01  7.075e+03   0.000        1
Rebound50s           -2.064e-01  8.497e+03   0.000        1
Inside50s             5.645e-01  8.133e+03   0.000        1
Clearances           -1.930e-01  1.525e+04   0.000        1
Clangers             -2.040e-01  1.056e+04   0.000        1
FreesFor             -7.699e-01  1.762e+04   0.000        1
ContendedPossessions -5.752e-01  7.424e+03   0.000        1
ContestedMarks       -1.869e+00  1.069e+04   0.000        1
MarksInside50         6.742e-01  1.676e+04   0.000        1
OnePercenters         1.616e-01  6.888e+03   0.000        1
Bounces              -8.763e-01  7.669e+03   0.000        1
GoalAssists           7.570e-01  3.299e+04   0.000        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 1.2540e+02  on 91  degrees of freedom
Residual deviance: 7.1154e-09  on 72  degrees of freedom
AIC: 40

Number of Fisher Scoring iterations: 25

Clearly something has gone badly wrong here, right? The p-value for every variable can't be 1 and every z value 0, can it?

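I googled it, and the best I could find was a suggestion that the error might be caused by having too many variables (which makes sense, given how many I have). So I started removing them one at a time, and I got the same error on every attempt until I was down to a single explanatory variable (y ~ x); only then did the error go away.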

Can someone explain what this error means? Why are all my p-values 1 and all my z values 0?
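
For reference, the z values and p-values in the table follow mechanically from the other two columns: the Wald statistic is Estimate / Std. Error, and for a binomial fit the two-sided p-value is 2 * pnorm(-abs(z)). A minimal sketch in base R, using three rows copied from the summary above (the vector names estimate and std_error are only for illustration):

```r
# Three rows copied from the coefficient table above
estimate  <- c(Score = 4.194e+00, Margin = 2.187e+00, Goals = -2.288e+01)
std_error <- c(Score = 5.165e+04, Margin = 3.742e+03, Goals = 3.082e+05)

# Wald z statistic and two-sided normal p-value
z <- estimate / std_error
p <- 2 * pnorm(-abs(z))

print(round(z, 3))
print(round(p, 3))
```

Because each standard error is thousands of times larger than its estimate, every z is essentially 0 and every p-value rounds to 1. So the numbers are internally consistent; the real question is why the standard errors are so large.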

Thanks in advance


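- Troy

Looks to me like a gross example of overfitting. You might want to try lasso/elastic net. Also, have you checked whether one of your predictors already gives perfect separation?

@Roland I'll look into lasso/elastic net; I'm guessing they're just packages you can add to R to handle this sort of thing? Since this is sports data and I'm modelling win/loss, my guess is that the "Margin" variable is probably a perfect separator; any suggestions for how to confirm that?

And what about Score? Cross-tabulate your dependent variable against the predictor in a contingency table.

Update: I think I've worked out why glm was behaving so strangely; there wasn't enough data. When I split the data by team, I had 20 rows in my training set and 5 rows in my test set (per team). Combining all the teams into one big data frame gives 18 times those numbers. I'm fairly sure that's why I no longer get the weird errors. Thanks for the help, lads.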
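
The perfect-separation check suggested in the comments can be sketched as follows. The data here are simulated purely for illustration (the toy data frame and its values are assumptions, not the poster's data): WinLoss is defined as Margin > 0, so Margin separates wins from losses exactly, much as a match margin would. Cross-tabulating the outcome against the sign of the predictor shows empty off-diagonal cells, and fitting glm() to such data typically reproduces the symptoms above: very large standard errors, a residual deviance near zero, and warnings such as "fitted probabilities numerically 0 or 1 occurred".

```r
set.seed(1)

# Simulated toy data (hypothetical, not the poster's data set)
n <- 92
toy <- data.frame(Margin  = sample(c(-60:-1, 1:60), n, replace = TRUE),
                  Tackles = rpois(n, 60))
toy$WinLoss <- as.integer(toy$Margin > 0)  # outcome fully determined by Margin

# Contingency table: zero counts off the diagonal indicate perfect separation
tab <- table(WinLoss = toy$WinLoss, MarginPositive = toy$Margin > 0)
print(tab)

# glm() warns here because no finite maximum-likelihood estimate exists;
# the warnings are suppressed only to keep the sketch's output short
fit <- suppressWarnings(
  glm(WinLoss ~ Margin + Tackles, data = toy, family = binomial)
)
print(coef(summary(fit)))
print(deviance(fit))
```

If the table shows separation, dropping the offending predictor or using a penalized fit are the usual options; the lasso/elastic net mentioned above is available in R through the glmnet package, which has to be installed separately.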