R: How do I combine groups in a Poisson regression to estimate contrasts?

r statistics

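I am not sure whether this is more of a programming question or a statistics question (i.e., a gap in my understanding).

I have a Poisson mixed model that I want to use to compare mean counts between groups across time periods: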

library(lme4)     # provides glmer()
library(emmeans)  # provides emmeans()

mod <- glmer(Y ~ TX_GROUP * time + (1|ID), data = dat, family = poisson)
mod_em <- emmeans(mod, c("TX_GROUP","time"), type = "response")

 TX_GROUP time     rate        SE  df asymp.LCL asymp.UCL
 0        1    5.743158 0.4566671 Inf  4.914366  6.711723
 1        1    5.529303 0.4639790 Inf  4.690766  6.517741
 0        2    2.444541 0.2981097 Inf  1.924837  3.104564
 1        2    1.467247 0.2307103 Inf  1.078103  1.996855
 0        3    4.570218 0.4121428 Inf  3.829795  5.453790
 1        3    1.676827 0.2472920 Inf  1.255904  2.238826

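If I try to combine the groups, the resulting value does not match the simple average of the combined groups.

Using the example data from the package, it seems to work, although I would be using a grouping in the formula: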

> warp.lm <- lm(breaks ~ wool*tension, data = warpbreaks)
> warp.emm <- emmeans(warp.lm, c("tension", "wool"))
> warp.emm
 tension wool   emmean       SE df lower.CL upper.CL
 L       A    44.55556 3.646761 48 37.22325 51.88786
 M       A    24.00000 3.646761 48 16.66769 31.33231
 H       A    24.55556 3.646761 48 17.22325 31.88786
 L       B    28.22222 3.646761 48 20.88992 35.55453
 M       B    28.77778 3.646761 48 21.44547 36.11008
 H       B    18.77778 3.646761 48 11.44547 26.11008

Confidence level used: 0.95

The sum of L and M for A should be 44 + 24 ≈ 68, and the sum of L and M for B should be 28 + 28 ≈ 56:

> contrast(warp.emm, list(A.LM = c(1, 1, 0, 0, 0, 0),
+                         B.LM = c(0, 0, 0, 1, 1, 0)))
 contrast estimate       SE df t.ratio p.value
 A.LM     68.55556 5.157299 48  13.293  <.0001
 B.LM     57.00000 5.157299 48  11.052  <.0001
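
As a quick cross-check of the contrast above, the sketch below rebuilds the two estimates by hand in base R. It rests on one assumption worth stating: warp.lm is a saturated model fitted to balanced data, so its EMMs coincide with the raw cell means of warpbreaks, and tapply() can stand in for emmeans() here. The names cell_means, emm, coefs and est are illustrative only.

```r
# Cell means of breaks, arranged tension (rows) by wool (columns)
cell_means <- with(warpbreaks, tapply(breaks, list(tension, wool), mean))

# Flatten column-wise to match the EMM row order: L A, M A, H A, L B, M B, H B
emm <- as.vector(cell_means)

# Same coefficient vectors as passed to contrast()
coefs <- list(A.LM = c(1, 1, 0, 0, 0, 0),
              B.LM = c(0, 0, 0, 1, 1, 0))

# Each estimate is the coefficient-weighted sum of the EMMs
est <- sapply(coefs, function(w) sum(w * emm))
print(round(est, 5))
```

The estimates are sums rather than averages because the coefficients in each vector add up to 2; using 0.5 in place of each 1 would give the average of the two cells instead.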

First, I suggest putting both contrasts into a single list, e.g.:

contr = list(`2+3|0` = c(0, 0, 1, 0, 1, 0),
             `2+3|1` = c(0, 0, 0, 1, 0, 1))

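You have to decide when you want to back-transform. Please see, and note, the discussion on "timing is everything". The two basic options are:

Option one: obtain the marginal means of the log counts, then back-transform: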

mod_con = update(contrast(mod_emm, contr), tran = "log")
summary(mod_con, type = "response")

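[The call to update() is needed because contrast() strips off transformations except in special cases, since it does not always know what scale to assign to arbitrary linear functions. For example, the difference of two square roots is not on a square-root scale.]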

Option two: back-transform the predictions, then sum them:

mod_emmr = regrid(mod_emm) 
contrast(mod_emmr, contr)

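The distinction between these results is the same as the distinction between a geometric mean (option 1) and an arithmetic mean (option 2). I doubt that either of them will yield the same results as the raw marginal means, because they are based on predictions from your model. Personally, I think the first option is the better choice, because sums are a linear operation and the model is linear on the log scale.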
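
To make the difference between the two options concrete, here is a minimal sketch on simulated data. Everything about the data is an assumption made for illustration: the cell rates in true_rate are invented, and a plain glm() without the (1|ID) random effect stands in for the glmer() fit so that the example needs only emmeans. The factor names follow the question.

```r
library(emmeans)

set.seed(42)
# Hypothetical stand-in data: 2 groups x 3 times, 40 counts per cell
dat <- expand.grid(TX_GROUP = factor(0:1), time = factor(1:3), rep = 1:40)
true_rate <- c(6, 5.5, 2.5, 1.5, 4.5, 1.7)   # invented rate for each cell
# expand.grid() varies TX_GROUP fastest, so the six rates recycle in cell order
dat$Y <- rpois(nrow(dat), lambda = rep(true_rate, times = 40))

mod <- glm(Y ~ TX_GROUP * time, data = dat, family = poisson)
emm <- emmeans(mod, c("TX_GROUP", "time"))   # estimates on the log scale

contr <- list(`2+3|0` = c(0, 0, 1, 0, 1, 0),
              `2+3|1` = c(0, 0, 0, 1, 0, 1))

# Option 1: sum the log rates, then back-transform
opt1 <- as.data.frame(summary(update(contrast(emm, contr), tran = "log"),
                              type = "response"))

# Option 2: back-transform each rate, then sum
opt2 <- as.data.frame(summary(contrast(regrid(emm), contr)))

# Estimated cell rates (third column of the response-scale summary)
rates <- as.data.frame(summary(emm, type = "response"))[[3]]

print(data.frame(contrast = opt1[[1]],
                 option1  = opt1[[2]],    # product of the two rates
                 option2  = opt2[[2]]))   # sum of the two rates
```

With coefficients of 1 and 1, option 1 returns the product of the two estimated rates and option 2 returns their sum, which is why the two sets of numbers can differ substantially.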

Addendum: there is actually a third option, which is to create a grouping variable. I will illustrate with the pigs dataset:

> pigs.lm <- lm(log(conc) ~ source + factor(percent), data = pigs)

Now let's create a grouping factor, group:

> pigs.emm = add_grouping(ref_grid(pigs.lm), "group", "percent", c("1&2","1&2","3&4","3&4"))
> str(pigs.emm)
'emmGrid' object with variables:
    source = fish, soy, skim
    percent =  9, 12, 15, 18
    group = 1&2, 3&4
Nesting structure:  percent %in% group
Transformation: "log"

Now obtain the EMMs for group, and note that they are just the averages of the respective levels:

> emmeans(pigs.emm, "group")
 group   emmean         SE df lower.CL upper.CL
 1&2   3.535084 0.02803816 23 3.477083 3.593085
 3&4   3.703931 0.03414907 23 3.633288 3.774574

Results are averaged over the levels of: source, percent 
Results are given on the log (not the response) scale. 
Confidence level used: 0.95

Here is the summary on the response scale:

> summary(.Last.value, type = "response")
 group response       SE df lower.CL upper.CL
 1&2   34.29790 0.961650 23 32.36517 36.34605
 3&4   40.60662 1.386678 23 37.83703 43.57893

Results are averaged over the levels of: source, percent 
Confidence level used: 0.95 
Intervals are back-transformed from the log scale

These are averages rather than sums, but otherwise it works, and the transformation is not zapped as it is in contrast().
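
The claim that the group EMMs are just averages of the respective levels can be checked directly. The sketch below refits the same pigs model, then compares each group EMM with the hand-computed average of its two percent-level EMMs; the names grp, per, by_hand and log_sums are illustrative, and the comparison assumes the default equal weighting.

```r
library(emmeans)

pigs.lm  <- lm(log(conc) ~ source + factor(percent), data = pigs)
pigs.emm <- add_grouping(ref_grid(pigs.lm), "group", "percent",
                         c("1&2", "1&2", "3&4", "3&4"))

# Both summaries are on the log scale
grp <- as.data.frame(summary(emmeans(pigs.emm, "group")))
per <- as.data.frame(summary(emmeans(pigs.lm, "percent")))

# Average the percent-level EMMs within each group by hand
by_hand <- c(mean(per$emmean[1:2]), mean(per$emmean[3:4]))
print(data.frame(group = grp$group, emmean = grp$emmean, by_hand = by_hand))

# Doubling an average of two levels recovers their sum on the log scale
log_sums <- 2 * grp$emmean
```

If a sum is wanted rather than an average, the last line shows one way to get it on the log scale; back-transforming that sum gives a product of the level estimates, as in option 1.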
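Thanks. The second method works for me, but the first one (which seems more intuitive) does not; it does not appear to return the back-transformed values: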
(mod_em_inj <- emmeans(mod_inj, c("TX_GROUP","time"), type = "response"))

 TX_GROUP time     rate        SE  df asymp.LCL asymp.UCL
 0        1    5.743158 0.4566671 Inf  4.914366  6.711723
 1        1    5.529303 0.4639790 Inf  4.690766  6.517741
 0        2    2.444541 0.2981097 Inf  1.924837  3.104564
 1        2    1.467247 0.2307103 Inf  1.078103  1.996855
 0        3    4.570218 0.4121428 Inf  3.829795  5.453790
 1        3    1.676827 0.2472920 Inf  1.255904  2.238826


# Marginal means for combined period (7 - 24 months) - Method 1 
(mod_em_inj2 <- emmeans(mod_inj, c("TX_GROUP","time")))

 TX_GROUP time    emmean         SE  df  asymp.LCL asymp.UCL
 0        1    1.7480092 0.07951497 Inf 1.59216273 1.9038557
 1        1    1.7100619 0.08391274 Inf 1.54559591 1.8745278
 0        2    0.8938574 0.12194916 Inf 0.65484147 1.1328734
 1        2    0.3833880 0.15724024 Inf 0.07520279 0.6915732
 0        3    1.5195610 0.09018011 Inf 1.34281119 1.6963107
 1        3    0.5169035 0.14747615 Inf 0.22785558 0.8059515


contr = list(`2+3|0` = c(0, 0, 1, 0, 1, 0),
             `2+3|1` = c(0, 0, 0, 1, 0, 1))
summary(contrast(mod_em_inj2, contr), type = "response")

 contrast  estimate        SE  df z.ratio p.value
 2+3|0    2.4134184 0.1541715 Inf  15.654  <.0001
 2+3|1    0.9002915 0.2198023 Inf   4.096  <.0001


# Marginal means for combined period (7 - 24 months) - Method 2
mod_emmr = regrid(mod_em_inj) 
contrast(mod_emmr, contr)

 contrast estimate        SE  df z.ratio p.value
 2+3|0    7.014759 0.5169870 Inf  13.569  <.0001
 2+3|1    3.144075 0.3448274 Inf   9.118  <.0001
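
One possible reading of the output above, offered as a sketch rather than a diagnosis: the Method 1 call omits the update(..., tran = "log") step from the answer, so its estimates would still be sums on the log scale. The base-R arithmetic below reproduces both sets of numbers from the log-scale EMMs printed in the comment; the variable names are illustrative only.

```r
# Log-scale EMMs for times 2 and 3, copied from the Method 1 table
log_emm_g0 <- c(0.8938574, 1.5195610)   # TX_GROUP 0
log_emm_g1 <- c(0.3833880, 0.5169035)   # TX_GROUP 1

# Method 1 as run: sums of logs, still on the log scale
sum_log <- c(`2+3|0` = sum(log_emm_g0), `2+3|1` = sum(log_emm_g1))

# Back-transforming those sums gives products of the rates, not sums
back <- exp(sum_log)

# Method 2: sums of the back-transformed rates
sum_rate <- c(`2+3|0` = sum(exp(log_emm_g0)), `2+3|1` = sum(exp(log_emm_g1)))

print(round(rbind(sum_log, back, sum_rate), 4))
```

The sum_log row matches the Method 1 estimates and the sum_rate row matches the Method 2 estimates, so even with the back-transformation applied, Method 1 would report a product of rates rather than a total count.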

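More detail is needed. We need a reproducible example, and we also need to see what you got. There is a mismatch in your wording: your equation has a sum, but you say you are not getting the simple average.

These forums stimulate improvements. For example, this one suggests that contrast()
when?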