四元变量R上的匹配
我正在使用您可以使用以下代码生成的数据集:四元变量R上的匹配,r,for-loop,dplyr,matching,R,For Loop,Dplyr,Matching,我正在使用您可以使用以下代码生成的数据集: set.seed(922) dat<-data.frame("y" = c(rnorm(500,20,2),rnorm(500, 40,2),rnorm(500,60,2),rnorm(500,80,2)), "quaternary" = sample(LETTERS[1:4],2000,replace = T), "binary" = sample(c("0","1"),2000,
set.seed(922)
dat<-data.frame("y" = c(rnorm(500,20,2),rnorm(500, 40,2),rnorm(500,60,2),rnorm(500,80,2)),
"quaternary" = sample(LETTERS[1:4],2000,replace = T),
"binary" = sample(c("0","1"),2000,replace = T))
但我不确定如何在“y”的值上建立4级因子“第四纪”的匹配
我不确定Matchit软件包中是否有一种优雅的方法可以做到这一点,但我愿意接受任何关于如何堆叠方法以获得良好平衡样本的建议。任何帮助都会很棒
编辑:
好吧,我想我很接近了。您可以在for循环中利用dplyr。它有点低效,我仍然需要考虑使用它来创建一个平衡样本的含义,但是它越来越近了
首先,在dat框架中,创建四个新变量,并填充NAs:
dat$A_match<-NA
dat$B_match<-NA
dat$C_match<-NA
dat$D_match<-NA
dat$A_match请记住,MatchIt只生成适合估计ATT的匹配样本(对治疗对象的平均治疗效果)。通常,MatchIt会选择一个它认为“已治疗”的组,该组通常是标记为“1”的治疗级别。然后将一个或多个控制单元与每个处理单元匹配
对于多项式处理,您还需要确定您感兴趣的估计。同样,如果您对ATT感兴趣,您必须选择一组被视为“治疗组”,其他组被视为“控制组”(我更愿意将其称为“焦点”和“非焦点”)。重要的是,您的治疗效果评估仅适用于与焦点群体组成相似的人群
如果这是您想要的,您需要选择一个组作为焦点,然后执行三个单独的matchit
调用,其中每个调用将非焦点组中的一个单元与焦点组中的单元进行匹配。协调小组保持不变。下面是一些我可能会使用的代码:
set.seed(922)
library(MatchIt)
dat<-data.frame("y" = c(rnorm(500,20,2),rnorm(500, 40,2),rnorm(500,60,2),rnorm(500,80,2)),
"quaternary" = sample(LETTERS[1:4],2000,replace = T, prob = c(.1, .3, .3, .3)),
"binary" = sample(c("0","1"),2000,replace = T))
focal <- "A"
dat$match.weights <- 1
for (lev in levels(dat$quaternary)) {
if (lev != focal) {
dat0 <- dat[dat$quaternary %in% c(focal, lev),]
dat0$treat <- as.numeric(dat0$quaternary == focal)
m.out <- matchit(treat ~ y, dat = dat0, replace = FALSE)
dat$match.weights[dat$quaternary == lev] <- m.out$weights[dat0$treat == 0]
}
}
library(cobalt)
bal.tab(quaternary ~ y, data = dat, weights = dat$match.weights,
method = "matching", focal = focal, un = TRUE)
#> Note: estimand and s.d.denom not specified; assuming ATT and treated.
#> Balance summary across all treatment pairs
#> Type Max.Diff.Un Max.Diff.Adj
#> y Contin. 0.1134 0.0009
#>
#> Sample sizes
#> B C D A
#> All 593 597 612 198
#> Matched 198 198 198 198
#> Unmatched 395 399 414 0
由reprex软件包(v0.2.1)于2018年10月13日创建
无论您使用何种策略,您都可以使用估计的匹配权重或ATT或ATE权重对治疗结果进行加权回归
[披露:我是cobalt和WeightIt软件包的作者。]Wow。感谢您的全面解释和您共享的代码。WeighIt软件包看起来很有希望。
require(dplyr) #haha. Hey that rhymes
for(i in 1:dim(dat)[1]){
dat_A_index<-dat%>%
mutate(y = ifelse(quaternary=="A",y,0),
abs = abs(dat[i,1]-y))%>%
summarise(A_index = which.min(abs))
dat$A_match[i]<-dat[dat_A_index$A_index,1]
rm(dat_A_index)
dat_B_index<-dat%>%
mutate(y = ifelse(quaternary=="B",y,0),
abs = abs(dat[i,1]-y))%>%
summarise(B_index = which.min(abs))
dat$B_match[i]<-dat[dat_B_index$B_index,1]
rm(dat_B_index)
dat_C_index<-dat%>%
mutate(y = ifelse(quaternary=="C",y,0),
abs = abs(dat[i,1]-y))%>%
summarise(C_index = which.min(abs))
dat$C_match[i]<-dat[dat_C_index$C_index,1]
rm(dat_C_index)
dat_D_index<-dat%>%
mutate(y = ifelse(quaternary=="D",y,0),
abs = abs(dat[i,1]-y))%>%
summarise(D_index = which.min(abs))
dat$D_match[i]<-dat[dat_D_index$D_index,1]
rm(dat_D_index)
}
set.seed(922)
library(MatchIt)
dat<-data.frame("y" = c(rnorm(500,20,2),rnorm(500, 40,2),rnorm(500,60,2),rnorm(500,80,2)),
"quaternary" = sample(LETTERS[1:4],2000,replace = T, prob = c(.1, .3, .3, .3)),
"binary" = sample(c("0","1"),2000,replace = T))
focal <- "A"
dat$match.weights <- 1
for (lev in levels(dat$quaternary)) {
if (lev != focal) {
dat0 <- dat[dat$quaternary %in% c(focal, lev),]
dat0$treat <- as.numeric(dat0$quaternary == focal)
m.out <- matchit(treat ~ y, dat = dat0, replace = FALSE)
dat$match.weights[dat$quaternary == lev] <- m.out$weights[dat0$treat == 0]
}
}
library(cobalt)
bal.tab(quaternary ~ y, data = dat, weights = dat$match.weights,
method = "matching", focal = focal, un = TRUE)
#> Note: estimand and s.d.denom not specified; assuming ATT and treated.
#> Balance summary across all treatment pairs
#> Type Max.Diff.Un Max.Diff.Adj
#> y Contin. 0.1134 0.0009
#>
#> Sample sizes
#> B C D A
#> All 593 597 612 198
#> Matched 198 198 198 198
#> Unmatched 395 399 414 0
library(WeightIt)
#Weighting for the ATT with A as focal:
w.out.att <- weightit(quaternary ~ y, data = dat, estimand = "ATT", focal = "A")
#> Using multinomial logit regression.
dat$w.att <- w.out.att$weights
#Weighting for the ATE:
w.out.ate <- weightit(quaternary ~ y, data = dat, estimand = "ATE")
#> Using multinomial logit regression.
dat$w.ate <- w.out.ate$weights
bal.tab(quaternary ~ y, data = dat, weights = c("w.att", "w.ate"),
method = "weighting", estimand = c("ATT", "ATE"), un = TRUE)
#> Balance summary across all treatment pairs
#> Type Max.Diff.Un Max.Diff.w.att Max.Diff.w.ate
#> y Contin. 0.1092 0.0055 0.0024
#>
#> Effective sample sizes
#> A B C D
#> All 198.000 593.000 597.000 612.000
#> w.att 198.000 591.139 593.474 604.162
#> w.ate 196.947 592.822 596.993 611.107