Defining a distribution for survival::survreg()


I am trying to fit a survreg model with a gamma distribution. Following ?survreg.distributions, I defined my custom distribution like this:

gamma <- list(name = 'gamma',
          parms = c(2,2),
          init = function(x, weights, ...){
            c(median(x), mad(x))
          },
          density = function(x, parms){
            shape <- parms[1]
            scale <- parms[2]
            cbind(pgamma(x, shape=shape, scale=scale),
                  1-pgamma(x, shape=shape, scale=scale),
                  dgamma(x, shape=shape, scale=scale),
                  (shape-1)/x - 1/scale,
                  (shape-1)*(shape-2)/x^2 - 2*(shape-1)/(x*scale) + 1/scale^2)
          },
          quantile = function(p, parms) {
            qgamma(p, shape=parms[1], scale=parms[2])
          },
          deviance = function(...) stop('deviance residuals not defined')
)

The error comes from some C code, but I think it originates much earlier.

Any hints, suggestions, or alternatives to survreg?

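I found the flexsurv package, which implements the generalized gamma distribution.

For the Weibull distribution, the estimates from survreg and flexsurvreg are similar (but note the different parameterization):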

require(survival)
summary(survreg(Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist='weibull'))

Call:
survreg(formula = Surv(log(time), status) ~ ph.ecog + sex, data = lung, 
    dist = "weibull")
              Value Std. Error      z         p
(Intercept)  1.7504     0.0364  48.13  0.00e+00
ph.ecog     -0.0660     0.0158  -4.17  3.10e-05
sex          0.0763     0.0237   3.22  1.27e-03
Log(scale)  -1.9670     0.0639 -30.77 6.36e-208

Scale= 0.14 

Weibull distribution
Loglik(model)= -270.5   Loglik(intercept only)= -284.3
    Chisq= 27.62 on 2 degrees of freedom, p= 1e-06 
Number of Newton-Raphson Iterations: 6 
n=227 (1 observation deleted due to missingness)


require(flexsurv)
flexsurvreg(Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist='weibull')

Call:
flexsurvreg(formula = Surv(log(time), status) ~ ph.ecog + sex,     data = lung, dist = "weibull")

Maximum likelihood estimates: 
            est    L95%    U95%
shape    7.1500  6.3100  8.1000
scale    5.7600  5.3600  6.1800
ph.ecog -0.0660 -0.0970 -0.0349
sex      0.0763  0.0299  0.1230

N = 227,  Events: 164,  Censored: 63
Total time at risk: 1232.1
Log-likelihood = -270.5, df = 4
AIC = 549
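
The two Weibull fits above appear to describe the same model under different parameterizations. As a rough check, assuming flexsurvreg reports the Weibull shape and scale in the same sense as dweibull, the survreg output should map onto it via shape = 1/Scale and scale = exp(Intercept). The numbers below are copied from the printed summaries; the variable names are only illustrative.

```r
# Values copied from the survreg summary above
sr_intercept <- 1.7504
sr_log_scale <- -1.9670

# Assumed mapping to the (shape, scale) parameterization printed by flexsurvreg
fs_shape <- 1 / exp(sr_log_scale)  # reciprocal of the survreg Scale
fs_scale <- exp(sr_intercept)      # exponentiated survreg intercept

round(c(shape = fs_shape, scale = fs_scale), 2)
```

Under that assumption the converted values line up with the shape and scale that flexsurvreg prints, and the covariate coefficients are identical in both summaries.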

With flexsurvreg we can fit the generalized gamma distribution to these data:

flexsurvreg(Surv(log(time), status) ~ ph.ecog + sex, data = lung, dist='gengamma')

Call:
flexsurvreg(formula = Surv(log(time), status) ~ ph.ecog + sex,     data = lung, dist = "gengamma")

Maximum likelihood estimates: 
            est    L95%    U95%
mu       1.7800  1.7100  1.8600
sigma    0.1180  0.0971  0.1440
Q        1.4600  1.0200  1.9100
ph.ecog -0.0559 -0.0853 -0.0266
sex      0.0621  0.0178  0.1060

N = 227,  Events: 164,  Censored: 63
Total time at risk: 1232.1
Log-likelihood = -267.57, df = 5
AIC = 545.15
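
The log-likelihoods printed above allow a quick comparison of the two fits. The sketch below recomputes AIC as -2 * logLik + 2 * df and, assuming the Weibull is the special case Q = 1 of the generalized gamma as parameterized in flexsurv, forms a likelihood ratio statistic on one degree of freedom. All numbers are copied from the printed output, and the variable names are illustrative.

```r
# Log-likelihoods and parameter counts copied from the flexsurvreg output above
ll_weibull  <- -270.5;  df_weibull  <- 4
ll_gengamma <- -267.57; df_gengamma <- 5

# AIC = -2 * log-likelihood + 2 * number of parameters
aic <- function(ll, df) -2 * ll + 2 * df
aic_weibull  <- aic(ll_weibull, df_weibull)
aic_gengamma <- aic(ll_gengamma, df_gengamma)

# Likelihood ratio test, assuming the Weibull is nested in the generalized gamma
lr_stat <- 2 * (ll_gengamma - ll_weibull)
lr_p <- pchisq(lr_stat, df = df_gengamma - df_weibull, lower.tail = FALSE)

c(aic_weibull = aic_weibull, aic_gengamma = aic_gengamma,
  lr_stat = lr_stat, lr_p = lr_p)
```

Because the printed log-likelihoods are rounded, the recomputed AIC can differ from the printed one in the last digit.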

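The logistic distribution (in contrast to survreg) is not built in, but it can easily be added as a custom distribution (see the flexsurvreg examples).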

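I have not tested it much, but flexsurv seems to be a good alternative to survival.

I see two possibilities. You may be triggering the error in the stop() call, or you may be passing negative numbers to log(), which would give NaN. If you do not provide data, how would anyone know?

@DWin: Thanks, I will try to debug the C code and check the inputs. My example is reproducible; the lung data ships with the survival package. log(time) is fine, all values are positive.

Searching on Markmail, I found that survreg is intended for distributions with location and scale parameters, and the gamma is not in that family. After further searching I found the flexsurv package; see the answer below.

I did some debugging. The problem seems to be that in survreg.fit a local function, derfun, is used to compute the derivatives of the density, and it returns several -Inf values as the first derivative and hence several NaN values as the second derivative. This may have nothing to do with location-scale distributions. For example, the exponential distribution is coded in survreg.distributions (albeit as a transformation), but it is actually a special case of the gamma distribution.

Thanks for posting an answer to the survreg question.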
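
One way such non-finite values can arise, assuming survreg evaluates the user-supplied density at standardized residuals that may be negative (an assumption about its internals, not something confirmed in this thread): the gamma density is zero for negative arguments, so its logarithm is -Inf. The values in z below are hypothetical and only illustrate the effect.

```r
# Hypothetical standardized residuals, some of them negative
z <- c(-1, -0.5, 0.5, 1)

# Gamma density with the shape and scale used in the custom distribution above
dens <- dgamma(z, shape = 2, scale = 2)

# Zero density at negative z gives a log-density of -Inf
log_dens <- log(dens)

data.frame(z = z, density = dens, log_density = log_dens)
```

If this is what happens inside the fitting routine, any arithmetic that combines such -Inf terms could then produce NaN.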
