R: How to create a difference-of-differences contrast with multcomp


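I am building a linear model in R and have specific planned contrasts that I am trying to implement. The model contains an interaction term. I believe I know how to correctly specify the matrix that glht() takes so that it returns estimates for the contrasts. However, I am not sure how to specify planned contrasts on the interaction in that matrix. Here is a reproducible example: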

library(multcomp)

# Create minimal dataset

set.seed(100)
df <- data.frame(response = rnorm(2000), 
                 age = sample(15:60, 2000, replace = TRUE),
                 sex = rep(c("Male", "Female"), each = 1000), 
                 ses = rep(c("L", "LM", "MH", "H"), 500))

# Basic linear model
m1 <- lm(response ~ age + sex * ses, 
         data = df)

summary(m1)

Call:
lm(formula = response ~ age + sex * ses, data = df)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3423 -0.6697  0.0044  0.6716  3.2542 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)   -0.0514746  0.0889766  -0.579    0.563
age           -0.0003435  0.0016934  -0.203    0.839
sexMale        0.0819362  0.0900868   0.910    0.363
sesL           0.0769168  0.0901245   0.853    0.394
sesLM          0.1289027  0.0901339   1.430    0.153
sesMH          0.0681291  0.0900821   0.756    0.450
sexMale:sesL  -0.0661439  0.1274402  -0.519    0.604
sexMale:sesLM -0.0925924  0.1275157  -0.726    0.468
sexMale:sesMH -0.1188700  0.1273978  -0.933    0.351

Residual standard error: 1.007 on 1991 degrees of freedom
Multiple R-squared:  0.001591,  Adjusted R-squared:  -0.002421 
F-statistic: 0.3966 on 8 and 1991 DF,  p-value: 0.9229

# Create a matrix of linear combinations
# Here 'H' and 'Female' are the reference categories, 

lc <- rbind("Female:H" = c(1, 0, 0, 0, 0, 0, 0, 0, 0),
            "Female:L" = c(1, 0, 0, 1, 0, 0, 0, 0, 0),
            "Female:LM" = c(1, 0, 0, 0, 1, 0, 0, 0, 0),
            "Female:MH" = c(1, 0, 0, 0, 0, 1, 0, 0, 0),
            "Male vs Female difference in L vs H" = c(1, 0, 0, 0, 0, 0, 1, 0, 0),
            "Male vs Female difference in LM vs H" = c(1, 0, 0, 0, 0, 0, 0, 1, 0),
            "Male vs Female difference in MH vs H" = c(1, 0, 0, 0, 0, 0, 0, 0, 1))

# Create matrix for the linfct argument in glht()
# One contrast compares the LM to the L category in females
# Another contrast compares the pooled LM and L categories to the H category in females

k <- rbind("Females: LM vs. L" = 
             lc["Female:LM", ] - lc["Female:L", ],
           "Females: Pooled L and LM vs H" =  
             ((lc["Female:LM", ] + lc["Female:L", ]) / 2) - lc["Female:H", ])

c1 <- glht(m1, linfct = k)
s1 <- summary(c1, adjusted(type = "none"))

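# confint.glht() takes the critical value via `calpha`; a second positional
# argument is matched to `parm`, so the intervals would stay multiplicity-adjusted
confint(c1, calpha = univariate_calpha())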

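Unfortunately, I don't think your lc and k are correct. Each column of the matrix corresponds, in order, to an element of names(m1$coef):

[1] (Intercept), [2] age, [3] sexMale, [4] sesL, [5] sesLM, [6] sesMH, [7] sexMale:sesL, [8] sexMale:sesLM, [9] sexMale:sesMH

So "Female:L" = c(1, 0, 1, 0, 0, 0, 0, 0, 0) actually means "Male:H", and "Female:LM" = c(1, 0, 0, 1, 0, 0, 0, 0, 0) actually means "Female:L". (This refers to lc as first posted; the asker acknowledges in the comments that the zeros were misplaced.) You used the difference of these two rows as k[1, ]. Note the estimates in s1: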

s1
# Linear Hypotheses:
#                                     Estimate Std. Error t value Pr(>|t|)
# Females: LM vs. L == 0             -0.005019   0.090091  -0.056    0.956
# Females: Pooled L and LM vs H == 0  0.079427   0.078038   1.018    0.309

# In fact, you did "sexMale:H vs Female:L".
m1$coef["sesL"] - m1$coef["sexMale"]  # -0.005019421
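A slip like this is easy to make when a contrast vector is filled in by position. One alternative is to fill each row by coefficient name. The following is a minimal sketch under that assumption; contrast_row is a hypothetical helper written for this example, not a multcomp function, and the data and model are recreated only so the block runs on its own.

```r
# Recreate the question's data and model so the sketch is self-contained
set.seed(100)
df <- data.frame(response = rnorm(2000),
                 age = sample(15:60, 2000, replace = TRUE),
                 sex = rep(c("Male", "Female"), each = 1000),
                 ses = rep(c("L", "LM", "MH", "H"), 500))
m1 <- lm(response ~ age + sex * ses, data = df)

# Column order that every row of a linfct matrix must follow
names(coef(m1))

# Hypothetical helper: build one contrast row from named weights
contrast_row <- function(model, weights) {
  row <- setNames(numeric(length(coef(model))), names(coef(model)))
  stopifnot(all(names(weights) %in% names(row)))  # catch misspelt names
  row[names(weights)] <- weights
  row
}

# Cell means for females at age 0 (Female and H are the reference levels)
female_L  <- contrast_row(m1, c("(Intercept)" = 1, sesL = 1))
female_LM <- contrast_row(m1, c("(Intercept)" = 1, sesLM = 1))

# Females: LM vs. L; the intercept cancels in the difference
female_LM - female_L
```

Because the weights are matched by name, reordering the terms in the model formula would not silently change what a row means.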
You can get the correct matrix with the approach shown in the post you linked.
group <- paste0(df$sex, df$ses)
group <- aggregate(model.matrix(m1) ~ group, FUN=mean)
rownames(group) <- group$group
group <- group[,-1]
group$age <- 0
lc2 <- as.matrix(group)
rownames(lc2)  <- c("Female:H", "Female:L", "Female:LM", "Female:MH", 
                    "Male:H", "Male:L", "Male:LM", "Male:MH")

lc2
          (Intercept) age sexMale sesL sesLM sesMH sexMale:sesL sexMale:sesLM sexMale:sesMH
Female:H            1   0       0    0     0     0            0             0             0
Female:L            1   0       0    1     0     0            0             0             0
Female:LM           1   0       0    0     1     0            0             0             0
Female:MH           1   0       0    0     0     1            0             0             0
Male:H              1   0       1    0     0     0            0             0             0
Male:L              1   0       1    1     0     0            1             0             0
Male:LM             1   0       1    0     1     0            0             1             0
Male:MH             1   0       1    0     0     1            0             0             1
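(Note: a difference between two rows has no intercept term, because the intercept is common to every row and cancels out.)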
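Reading each row of the matrix above as a cell mean at age 0 also shows why a difference of differences reduces to interaction coefficients only. A short derivation for the L vs H case, writing \beta for the coefficients of m1 and \mu for the cell means (notation introduced here for illustration):

```latex
\begin{aligned}
\mu_{F,H} &= \beta_{0} \\
\mu_{F,L} &= \beta_{0} + \beta_{\text{sesL}} \\
\mu_{M,H} &= \beta_{0} + \beta_{\text{sexMale}} \\
\mu_{M,L} &= \beta_{0} + \beta_{\text{sexMale}} + \beta_{\text{sesL}} + \beta_{\text{sexMale:sesL}} \\
(\mu_{M,L} - \mu_{M,H}) - (\mu_{F,L} - \mu_{F,H})
  &= (\beta_{\text{sesL}} + \beta_{\text{sexMale:sesL}}) - \beta_{\text{sesL}}
   = \beta_{\text{sexMale:sesL}}
\end{aligned}
```

The intercept and both main effects cancel, so the corresponding contrast row has a single 1 in the sexMale:sesL column and zeros elsewhere.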


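This still seems like a statistics question rather than a programming one. If you know exactly what the contrasts should look like but don't know how to create the object, that is one thing. If you don't know how to use contrasts to actually test some hypothesis, that seems like a question better suited to another site, and I'm not sure why you are asking it here.

I was specifically looking for code, which is why I asked on StackOverflow rather than CrossValidated. To be honest, though, this may be a gray area, since I also don't fully understand how to do some of this in theory.

You're right, my 0s were off! Thanks for catching that; I was being sloppy. Your way is definitely better. I appreciate your code, but I think I may not have been clear. I was actually looking for code to create the new interaction terms, not the main effects you gave me. As I understand it, the effects you coded give me the effects of L and M for males. What I want to know, however, is the "additional difference" between L and M in males versus females. I will edit my original post and rephrase it to make this clearer.

Thank you so much! I had been trying hard to figure this out. Now I understand everything you did.

@RNB; I'm glad I could help (I edited some typos).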
k2 <- rbind("Females: LM vs. L" = 
              lc2["Female:LM", ] - lc2["Female:L", ],
            "Females: Pooled L and LM vs H" =  
              ((lc2["Female:LM", ] + lc2["Female:L", ]) / 2) - lc2["Female:H", ])

c1_2 <- glht(m1, linfct = k2)
s1_2 <- summary(c1_2, adjusted(type = "none"))

s1_2
# Linear Hypotheses:
#                                    Estimate Std. Error t value Pr(>|t|)
# Females: LM vs. L == 0              0.05199    0.09008   0.577    0.564
# Females: Pooled L and LM vs H == 0  0.10291    0.07807   1.318    0.188

m1$coef["sesLM"] - m1$coef["sesL"]       # 0.05198589
(m1$coef["sesL"] + m1$coef["sesLM"])/2   # 0.1029097 
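# Refit with sex nested within ses (no sex main effect): sesH:sexMale in m2
# is the same estimate as sexMale in m1, i.e. the Male vs Female difference
# within the reference category H
m2 <- lm(response ~ age + ses + sex:ses , data = df)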
summary(m2) # excerpt
# sesH:sexMale   0.0819362  0.0900868   0.910    0.363

summary(m1) # excerpt
# sexMale        0.0819362  0.0900868   0.910    0.363
k3 <- rbind(
  "Male vs Female difference in L vs H"  = 
    (lc2["Male:L",] - lc2["Male:H",]) - (lc2["Female:L",] - lc2["Female:H",]),
  "Male vs Female difference in LM vs H" = 
    (lc2["Male:LM",] - lc2["Male:H",]) - (lc2["Female:LM",] - lc2["Female:H",]),
  "Male vs Female difference in MH vs H" = 
    (lc2["Male:MH",] - lc2["Male:H",]) - (lc2["Female:MH",] - lc2["Female:H",]),
  # above is modified lc[5:7, ], below is what you want
  "Male vs Female difference in LM vs L" = 
    (lc2["Male:LM",] - lc2["Male:L",]) - (lc2["Female:LM",] - lc2["Female:L",]),
  "Male vs Female difference in Pooled L and LM vs H" = 
    ((lc2["Male:L", ] + lc2["Male:LM", ]) / 2 - lc2["Male:H", ])
      - ((lc2["Female:L", ] + lc2["Female:LM", ]) / 2 - lc2["Female:H", ])
)

k3   # row and column names are abbreviated below to keep the output readable
                 (Intercept) age sexM  sesL sesLM  sesMH sexM:sesL sexM:sesLM sexM:sesMH
M vs F diff in L vs H      0   0    0     0     0      0         1          0          0
M vs F diff in LM vs H     0   0    0     0     0      0         0          1          0
M vs F diff in MH vs H     0   0    0     0     0      0         0          0          1
M vs F diff in LM vs L     0   0    0     0     0      0        -1          1          0
MvsF diff in L&LM vs H     0   0    0     0     0      0       0.5        0.5          0
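The last two rows are the contrasts asked for, and they can be passed to glht() like any other linfct matrix. The following is a minimal sketch, assuming the same data and model as in the question; the matrix name k4 and the object name c3 are chosen for this example, and the rows are filled in by coefficient name instead of being derived from lc2.

```r
library(multcomp)

# Recreate the question's data and model so the sketch is self-contained
set.seed(100)
df <- data.frame(response = rnorm(2000),
                 age = sample(15:60, 2000, replace = TRUE),
                 sex = rep(c("Male", "Female"), each = 1000),
                 ses = rep(c("L", "LM", "MH", "H"), 500))
m1 <- lm(response ~ age + sex * ses, data = df)

# Zero matrix with one column per coefficient, then fill rows by name
k4 <- matrix(0, nrow = 2, ncol = length(coef(m1)),
             dimnames = list(c("Male vs Female difference in LM vs L",
                               "Male vs Female difference in Pooled L and LM vs H"),
                             names(coef(m1))))
k4[1, c("sexMale:sesL", "sexMale:sesLM")] <- c(-1, 1)
k4[2, c("sexMale:sesL", "sexMale:sesLM")] <- c(0.5, 0.5)

c3 <- glht(m1, linfct = k4)

# Unadjusted tests and unadjusted confidence intervals
summary(c3, test = adjusted(type = "none"))
confint(c3, calpha = univariate_calpha())
```

Each estimate should equal the matching combination of interaction coefficients, for example coef(m1)["sexMale:sesLM"] - coef(m1)["sexMale:sesL"] for the first row.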