R: Propensity score matching with MatchIt. How to find the number of matched observations with replace=TRUE?


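I am matching data with the MatchIt package in R. I have fewer controls than treated units and use the option replace = TRUE. According to the manual, the weights tell us how often each control was matched.

From the manual:

"To match with replacement, use replace = TRUE. After matching with replacement, the weights can be used to reflect the frequency with which each control unit was matched."

However, I do not understand why the weights can be fractional, or how they reflect that frequency.

For example, I added replace = TRUE to the example in the manual (see p. 18):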

library("dplyr")
library("MatchIt")
# m.out1: the manual's matchit() example, fitted with replace = TRUE
#>         treat age educ black hispan married nodegree re74 re75 re78
#> PSID388 0 19 11 1 0 1 0 16485.520
#> PSID3900481300.000
#> PSID392 017100.000
#> PSID393 0 38 12 0 1 0 0 0 18756.780
#> PSID396 0 48 14 0 0 1 0 0 7236.427
#> PSID398 0 17 8 1 0 0 1 0 4520.366
#> PSID400 0 37 8 1 0 0 1 0 648.722
#> PSID40101710053.619
#> PSID407 0 23 12 0 0 0 0 3902.676
#> PSID409 0 17 10 0 0 1 0 14942.770
#> PSID411 0 18 10 1 0 1 0 5306.516
#> PSID413 0 17 10 0 1 0 3859.822
#> PSID419 0 51 41 0 1 0 0 0.000
#> PSID423 0 27 10 1 0 1 0 7543.794
#> PSID4250181010150.500
#>          distance weights
#> PSID388 0.4067545     0.6
#> PSID390 0.4042321     1.2
#> PSID392 0.3974677     0.6
#> PSID393 0.4016920     4.2
#> PSID396 0.4152715     0.6
#> PSID398 0.3758217     1.8
#> PSID400 0.3595084     0.6
#> PSID401 0.3974677     1.2
#> PSID407 0.4144044     1.8
#> PSID409 0.3974677     0.6
#> PSID411 0.3966277     1.2
#> PSID413 0.3974677     1.2
#> PSID419 0.3080590     0.6
#> PSID423 0.3890954     1.2
#> PSID425 0.4076015     1.2

The control "PSID393" has a weight of about 4.2, so I assumed that this control was matched 4 or 5 times (after rounding).

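However, we can also look at match.matrix to see the matched treated and control units one by one. Filtering for "PSID393", we see that this control was actually matched 7 times: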

m.out1$match.matrix %>% data.frame() %>% filter(X1 == "PSID393")
#>        X1
#> 1 PSID393
#> 2 PSID393
#> 3 PSID393
#> 4 PSID393
#> 5 PSID393
#> 6 PSID393
#> 7 PSID393
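
The same count can be obtained for every control at once by tabulating the matched IDs. The following is only a sketch: it uses a small made-up character matrix as a stand-in for m.out1$match.matrix (the treated-unit row names T1 to T5 and the matches themselves are hypothetical), so that it runs with base R alone.

```r
# Stand-in for m.out1$match.matrix: one row per treated unit, one column
# holding the ID of its matched control (hypothetical data, not lalonde).
mm <- matrix(c("PSID393", "PSID390", "PSID393", "PSID388", "PSID393"),
             ncol = 1,
             dimnames = list(paste0("T", 1:5), "1"))

# Count how many times each control ID appears among the matches,
# most frequently used control first.
match_counts <- sort(table(mm[, 1]), decreasing = TRUE)
print(match_counts)
```

With the real object, replacing mm by m.out1$match.matrix should give the number of times each control was reused, assuming one matched control per treated unit.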

Created on 2019-05-06 by the reprex package (v0.2.1)

How do I interpret these two outputs correctly?

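The weights are scaled so that they sum to the number of uniquely matched observations in the control group. With the example data, note that the sum of the weights equals the number of observations and that the mean weight is 1. In addition, the most frequently used observation has a weight seven times that of the least frequently used one:

  treat min.weight max.weight mean.weight sum.weights     n max.match.ratio
1     0      0.605       4.24           1         112   112               7
2     1      1           1              1         185   185               1

There is an FAQ at the end of the article. Item 5.3, "How exactly are the weights created?", states that "the control group weights are scaled to sum to the number of uniquely matched control units."

To see the distribution of the weights, we can run: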

match.data(m.out1) %>% 
  group_by(treat, weights) %>% 
  tally %>% 
  group_by(treat) %>% 
  mutate(weight.ratio = weights/min(weights))

  treat weights     n weight.ratio
1     0   0.605    74            1
2     0   1.21     19            2
3     0   1.82     10            3
4     0   2.42      6            4
5     0   3.63      2            6
6     0   4.24      1            7
7     1   1       185            1
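
The arithmetic behind these numbers can be reproduced by hand. The sketch below takes the match counts implied by the table above (74 controls used once, 19 used twice, and so on), assumes each of the 185 treated units received exactly one match, and rescales the counts so that they sum to the number of unique matched controls. The variable names are illustrative and not part of MatchIt.

```r
# Times each matched control was used, read off the weight.ratio table.
times_used <- rep(c(1, 2, 3, 4, 6, 7), times = c(74, 19, 10, 6, 2, 1))

n_controls <- length(times_used)  # unique matched controls: 112
n_matches  <- sum(times_used)     # total matches, one per treated unit: 185

# Rescale so the control weights sum to the number of unique controls.
w <- times_used * n_controls / n_matches

round(range(w), 3)  # 0.605 4.238
sum(w)              # 112
mean(w)             # 1

# Going the other way: recover the match count from a weight.
recovered <- round(w * n_matches / n_controls)
stopifnot(all(recovered == times_used))
```

Under these assumptions, a weight of 4.24 corresponds to 4.24 × 185 / 112 ≈ 7 matches, which agrees with the seven rows for "PSID393" in match.matrix.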