Defining a distribution for survival::survreg()
I am trying to fit a survreg model using the gamma distribution. Following ?survreg.distributions, I defined my custom distribution like this:
gamma <- list(name = 'gamma',
              parms = c(2, 2),  # shape, scale
              init = function(x, weights, ...) {
                c(median(x), mad(x))
              },
              density = function(x, parms) {
                shape <- parms[1]
                scale <- parms[2]
                # columns: F, 1 - F, f, f'/f, f''/f
                cbind(pgamma(x, shape = shape, scale = scale),
                      1 - pgamma(x, shape = shape, scale = scale),
                      dgamma(x, shape = shape, scale = scale),
                      (shape - 1) / x - 1 / scale,
                      (shape - 1) * (shape - 2) / x^2 - 2 * (shape - 1) / (x * scale) + 1 / scale^2)
              },
              quantile = function(p, parms) {
                qgamma(p, shape = parms[1], scale = parms[2])
              },
              deviance = function(...) stop('deviance residuals not defined')
)
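As a sanity check on the last two columns of the density matrix, the analytic expressions for f'/f and f''/f can be compared with finite-difference approximations. This is only a sketch: the function name gamma_density, the evaluation points, and the step size h are illustrative choices, not part of the original post.

```r
# Density matrix in the column layout used above: F, 1 - F, f, f'/f, f''/f.
# gamma_density is a hypothetical standalone copy of the density component.
gamma_density <- function(x, parms) {
  shape <- parms[1]
  scale <- parms[2]
  cbind(pgamma(x, shape = shape, scale = scale),
        1 - pgamma(x, shape = shape, scale = scale),
        dgamma(x, shape = shape, scale = scale),
        (shape - 1) / x - 1 / scale,
        (shape - 1) * (shape - 2) / x^2 - 2 * (shape - 1) / (x * scale) + 1 / scale^2)
}

parms <- c(3, 2)          # illustrative shape and scale
x <- c(0.5, 1, 2, 5)      # strictly positive evaluation points
h <- 1e-4                 # finite-difference step

f <- function(x) dgamma(x, shape = parms[1], scale = parms[2])

# Central finite differences for f' and f''
d1 <- (f(x + h) - f(x - h)) / (2 * h)
d2 <- (f(x + h) - 2 * f(x) + f(x - h)) / h^2

m <- gamma_density(x, parms)
max_err_d1 <- max(abs(m[, 4] - d1 / f(x)))  # discrepancy in f'/f
max_err_d2 <- max(abs(m[, 5] - d2 / f(x)))  # discrepancy in f''/f
c(max_err_d1, max_err_d2)
```

Both discrepancies should be close to zero for positive x, which suggests the formulas themselves are not the source of the error.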
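The error comes from some C code, but I suspect it originates much earlier.
Any hints, suggestions, or alternatives to survreg?

I found the flexsurv package, which implements the generalized gamma distribution.
For the Weibull distribution, survreg and flexsurvreg give similar estimates (but note the different parameterization):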
require(survival)
summary(survreg(Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist='weibull'))
Call:
survreg(formula = Surv(log(time), status) ~ ph.ecog + sex, data = lung,
dist = "weibull")
              Value Std. Error      z         p
(Intercept)  1.7504     0.0364  48.13  0.00e+00
ph.ecog     -0.0660     0.0158  -4.17  3.10e-05
sex          0.0763     0.0237   3.22  1.27e-03
Log(scale)  -1.9670     0.0639 -30.77 6.36e-208
Scale= 0.14
Weibull distribution
Loglik(model)= -270.5 Loglik(intercept only)= -284.3
Chisq= 27.62 on 2 degrees of freedom, p= 1e-06
Number of Newton-Raphson Iterations: 6
n=227 (1 observation deleted due to missingness)
require(flexsurv)
flexsurvreg(Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist='weibull')
Call:
flexsurvreg(formula = Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist = "weibull")
Maximum likelihood estimates:
         est      L95%     U95%
shape     7.1500   6.3100   8.1000
scale     5.7600   5.3600   6.1800
ph.ecog  -0.0660  -0.0970  -0.0349
sex       0.0763   0.0299   0.1230
N = 227, Events: 164, Censored: 63
Total time at risk: 1232.1
Log-likelihood = -270.5, df = 4
AIC = 549
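The two Weibull fits above describe the same model in different parameterizations. A minimal sketch of the conversion, assuming flexsurvreg reports the Weibull shape and scale in the dweibull convention: shape = 1/Scale and scale = exp(Intercept). The input numbers are copied from the survreg summary above.

```r
# Estimates copied from the survreg summary above
intercept <- 1.7504
log_scale <- -1.9670

# Map to the shape/scale parameterization (assumed dweibull convention)
shape_flexsurv <- 1 / exp(log_scale)  # compare with flexsurvreg's shape estimate
scale_flexsurv <- exp(intercept)      # compare with flexsurvreg's scale estimate

round(c(shape = shape_flexsurv, scale = scale_flexsurv), 2)
```

The covariate coefficients (ph.ecog, sex) need no conversion here; they are identical in both outputs.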
Using flexsurvreg, we can fit the generalized gamma distribution to these data:
flexsurvreg(Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist='gengamma')
Call:
flexsurvreg(formula = Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist = "gengamma")
Maximum likelihood estimates:
         est      L95%     U95%
mu        1.7800   1.7100   1.8600
sigma     0.1180   0.0971   0.1440
Q         1.4600   1.0200   1.9100
ph.ecog  -0.0559  -0.0853  -0.0266
sex       0.0621   0.0178   0.1060
N = 227, Events: 164, Censored: 63
Total time at risk: 1232.1
Log-likelihood = -267.57, df = 5
AIC = 545.15
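The reported log-likelihoods allow a rough comparison of the two fits. The sketch below recomputes the AIC values and a likelihood ratio test, assuming the Weibull is the Q = 1 special case of flexsurv's generalized gamma (so the models are nested and differ by one parameter). The log-likelihoods are the rounded values printed above, so the results are approximate.

```r
# Log-likelihoods and parameter counts copied from the two flexsurvreg outputs
loglik_weibull  <- -270.5
df_weibull      <- 4
loglik_gengamma <- -267.57
df_gengamma     <- 5

# AIC = 2 * (number of parameters) - 2 * log-likelihood
aic_weibull  <- 2 * df_weibull  - 2 * loglik_weibull
aic_gengamma <- 2 * df_gengamma - 2 * loglik_gengamma

# Likelihood ratio test of Weibull against generalized gamma (assumed nested)
lrt     <- 2 * (loglik_gengamma - loglik_weibull)
p_value <- pchisq(lrt, df = df_gengamma - df_weibull, lower.tail = FALSE)

c(aic_weibull = aic_weibull, aic_gengamma = aic_gengamma, lrt = lrt, p_value = p_value)
```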
The logistic distribution (unlike in survreg) is not built in, but it can easily be added as a custom distribution (see the flexsurvreg examples).
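I haven't tested it much, but flexsurv seems to be a good alternative to survival.

I see two possibilities. You could be triggering the error through the stop() call, or, if you are passing negative numbers to log(), you would expect NaNs. Without the data, who would know?

@DWin: Thanks, I'll try to debug the C code and check the inputs. My example is reproducible; the lung data ships with the survival package. log(time) is fine, all values are positive.

Searching on Markmail, I found that survreg is intended for distributions with location-scale parameters, and the gamma is not in that family. After some further searching I found the flexsurv package; see the answer.

I did some debugging. The problem seems to be that in survreg.fit a local function, derfun, is used to compute the derivatives of the density; it returns several -Inf values for the first derivative and hence several NaN values for the second derivative. This may have nothing to do with location-scale distributions: for example, the exponential distribution is coded in survreg.distributions (albeit as a transformation), yet it is in fact a special case of the gamma distribution.

Thanks for posting an answer to your survreg question.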
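One way non-finite values like those described in the debugging comment can arise is by evaluating the gamma density outside its support. The sketch below assumes, without having confirmed it in the survreg source, that the density function is called on standardized residuals of the form (y - eta)/sigma, which can be zero or negative. The name gamma_density and the values in z are hypothetical.

```r
# Same density matrix as in the question: F, 1 - F, f, f'/f, f''/f.
# gamma_density is a hypothetical standalone copy for illustration.
gamma_density <- function(x, parms) {
  shape <- parms[1]
  scale <- parms[2]
  cbind(pgamma(x, shape = shape, scale = scale),
        1 - pgamma(x, shape = shape, scale = scale),
        dgamma(x, shape = shape, scale = scale),
        (shape - 1) / x - 1 / scale,
        (shape - 1) * (shape - 2) / x^2 - 2 * (shape - 1) / (x * scale) + 1 / scale^2)
}

# Hypothetical standardized residuals; zero and negative values lie outside
# the support of the gamma distribution.
z <- c(-1, 0, 1)
m <- gamma_density(z, parms = c(2, 2))

log_f <- log(m[, 3])  # log density: -Inf wherever the density is 0
ratio1 <- m[, 4]      # f'/f: infinite at z = 0
ratio2 <- m[, 5]      # f''/f: NaN at z = 0 (0/0 in the first term)

data.frame(z = z, log_f = log_f, ratio1 = ratio1, ratio2 = ratio2)
```

If this is what happens inside the fit, any log-density or derivative evaluated at such points is not finite, which would be consistent with an error raised later from the C code.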