R: choosing the number of breakpoints in segmented regression
I am trying to estimate multiple breakpoints in X for a response variable Y. When I run the segmented package in R, I get one estimated breakpoint at X = 14 if I pass one starting value in the psi argument, and two estimated breakpoints at X = 6.5 and X = 11.4 if I pass two starting values. How do I decide whether two breakpoints or one breakpoint is the better choice? See the code and output below.

Specifying 1 breakpoint:
segmented.glm(obj = fit.glm, seg.Z = ~x, psi = 10)
Estimated Break-Point(s):
Est. St.Err
psi1.x 14 2.691
Null deviance: 230311 on 1509 degrees of freedom
Residual deviance: 175795 on 1480 degrees of freedom
AIC: 11531
Convergence attained in 0 iter. (rel. change 1.5525e-08)
> slope(fit.seg)
$x
Est. St.Err. t value CI(95%).l CI(95%).u
slope1 -0.847880 0.097683 -8.679900 -1.0393 -0.65643
slope2 0.036962 0.574770 0.064308 -1.0896 1.16350
Specifying 2 breakpoints:
fit.seg<-segmented(fit.glm, seg.Z=~x, psi= c(6, 11))
Estimated Break-Point(s):
Est. St.Err
psi1.x 6.562 1.771
psi2.x 11.398 1.660
Null deviance: 230311 on 1509 degrees of freedom
Residual deviance: 175594 on 1478 degrees of freedom
AIC: 11533
Convergence attained in 1 iter. (rel. change 0)
> slope(fit.seg)
$x
Est. St.Err. t value CI(95%).l CI(95%).u
slope1 -0.56943 0.23681 -2.40460 -1.03360 -0.10530
slope2 -1.25180 0.38974 -3.21190 -2.01570 -0.48794
slope3 -0.17365 0.31700 -0.54781 -0.79495 0.44765
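I also used seg.control, but I do not know how to interpret the output. (Based on Muggeo, V.M.R. (2008) segmented: an R package to fit regression models with broken-line relationships. R News 8/1, 20-25.)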
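Can anyone help me figure out how to determine whether 2 breakpoints or 1 breakpoint is the better estimate?

The function selgmented() (also in the R package segmented) is a wrapper that selects the "best" number of breakpoints, either via hypothesis testing (e.g. the score test) or via the BIC. Currently, selection via hypothesis testing is limited to choosing among 0, 1, or 2 breakpoints.

Kind regards,
Vito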
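
As an illustration of the answer above, here is a minimal sketch of how selgmented() might be called. It uses simulated data, not the poster's data: x, y, and fit are hypothetical, and the arguments shown (Kmax, type) reflect my reading of the documentation for recent versions of segmented, so check ?selgmented against your installed version.

```r
library(segmented)

set.seed(1)

# Hypothetical data with a single true breakpoint at x = 10
n <- 300
x <- runif(n, 0, 20)
y <- 5 - 0.8 * x + 0.9 * pmax(x - 10, 0) + rnorm(n, sd = 1)

# Starting linear fit without any breakpoint
fit <- lm(y ~ x)

# Sequential score tests: chooses among 0, 1 or 2 breakpoints
sel.score <- selgmented(fit, seg.Z = ~x, Kmax = 2, type = "score")

# BIC-based selection: fits 0..Kmax breakpoints and keeps the lowest BIC
sel.bic <- selgmented(fit, seg.Z = ~x, Kmax = 3, type = "bic")

# The returned objects are fitted models; print them to see
# how many breakpoints were retained and where they are
print(sel.score)
print(sel.bic)
```

If a segmented fit comes back, slope() and summary() can be applied to it in the same way as to fit.seg in the question.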
> o <- segmented(fit.glm, seg.Z=~x, psi=NA, control=seg.control(display=FALSE, K=2))
Warning message:
max number of iterations (1) attained
> slope(o) # defaults to confidence level of 0.95 (conf.level=0.95)
$x
Est. St.Err. t value CI(95%).l CI(95%).u
slope1 -0.56943 0.23681 -2.40460 -1.03360 -0.10530
slope2 -1.25180 0.38974 -3.21190 -2.01570 -0.48794
slope3 -0.17365 0.31700 -0.54781 -0.79495 0.44765
> o <- segmented(fit.glm, seg.Z=~x, psi=NA, control=seg.control(display=FALSE, K=1))
Warning messages:
1: max number of iterations (1) attained
2: max number of iterations (1) attained
> slope(o) # defaults to confidence level of 0.95 (conf.level=0.95)
$x
Est. St.Err. t value CI(95%).l CI(95%).u
slope1 -0.847880 0.097683 -8.679900 -1.0393 -0.65643
slope2 0.036966 0.574770 0.064314 -1.0896 1.16350
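
The two summaries in the question already report an AIC for each fit, so a rough comparison is possible without refitting anything. The sketch below only copies the printed numbers into hypothetical vectors (aic, dev, df_res) and compares them. Information criteria are an informal guide for breakpoint models, since the number of breakpoints is not a regular parameter, so treat this as a sanity check rather than a formal test.

```r
# Values copied from the two summaries printed in the question
aic    <- c(one_break = 11531,  two_breaks = 11533)
dev    <- c(one_break = 175795, two_breaks = 175594)
df_res <- c(one_break = 1480,   two_breaks = 1478)

# AIC difference relative to the best (lowest) model
delta_aic <- aic - min(aic)
best <- names(which.min(aic))

# Extra parameters spent on the second breakpoint
# (one breakpoint location plus one slope change)
extra_par <- df_res[["one_break"]] - df_res[["two_breaks"]]

# Reduction in residual deviance bought by those parameters
dev_drop <- dev[["one_break"]] - dev[["two_breaks"]]

print(delta_aic)
cat("Lowest AIC:", best, "\n")
cat("Deviance drop of", dev_drop, "for", extra_par, "extra parameters\n")
```

On these numbers the one-breakpoint fit has the lower AIC (by 2 units), and the second breakpoint reduces the residual deviance by only 201 out of roughly 175,000.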