R: exact matching on age with matchit does not work as expected


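I am trying to do a case-control exact match on age. My dataset consists of 139 eyes from 75 patients, divided into 2 groups by the dichotomous variable G6PDcarente = 0/1.

I am trying to do the matching with this code: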

library(MatchIt)

match.it <- matchit(G6PDcarente ~ age, data = newdata, method = "exact", ratio = 1, replace = FALSE)
match.it

match.it <- matchit(G6PDcarente ~ age, data = newdata, method = "nearest", exact = "age", ratio = 1, replace = FALSE)

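Why are the matched sample sizes so different? Shouldn't the matched control and treated samples be the same size (e.g. 31-31)? How can I get an exact match on age with the same sample size in both groups?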

Can anyone help me?

Thanks.

Here is the code to reproduce a sample of my data:

newdata <- structure(list(NumeroProgressivo = c(43, 44, 137, 138, 129, 130, 
65, 111, 148, 149, 35, 36, 83, 84, 37, 38, 127, 128, 160, 161, 
75, 76, 53, 54, 119, 120, 109, 110, 57, 58, 39, 51, 52, 29, 30, 
71, 72, 154, 155, 77, 78, 1, 2, 61, 62, 158, 101, 102, 27, 28, 
73, 103, 104, 121, 122, 152, 153, 107, 108, 45, 46, 81, 82, 139, 
140, 59, 60, 95, 96, 33, 34, 91, 92, 26, 49, 50, 79, 6, 63, 64, 
15, 16, 31, 32, 143, 144, 69, 70, 89, 90, 41, 42, 17, 18, 67, 
68, 115, 116, 150, 151, 97, 98, 93, 94, 135, 136, 55, 56, 131, 
132, 162, 163, 21, 22, 23, 24, 156, 157, 133, 166, 174, 175, 
164, 165, 172, 173, 176, 177), IDpaziente = c(22, 22, 67, 67, 
63, 63, 33, 56, 73, 73, 18, 18, 42, 42, 19, 19, 62, 62, 79, 79, 
38, 38, 27, 27, 60, 60, 55, 55, 29, 29, 20, 26, 26, 15, 15, 36, 
36, 76, 76, 39, 39, 1, 1, 31, 31, 78, 51, 51, 14, 14, 37, 52, 
52, 61, 61, 75, 75, 54, 54, 23, 23, 41, 41, 68, 68, 30, 30, 48, 
48, 17, 17, 46, 46, 13, 25, 25, 40, 3, 32, 32, 8, 8, 16, 16, 
70, 70, 35, 35, 45, 45, 21, 21, 9, 9, 34, 34, 58, 58, 74, 74, 
49, 49, 47, 47, 66, 66, 28, 28, 64, 64, 80, 80, 11, 11, 12, 12, 
77, 77, 65, 82, 86, 86, 81, 81, 85, 85, 87, 87), Occhio = c("OD", 
"OS", "OD", "OS", "OD", "OS", "OD", "OD", "OD", "OS", "OD", "OS", 
"OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", 
"OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OD", "OS", "OD", 
"OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", 
"OD", "OD", "OS", "OD", "OS", "OD", "OD", "OS", "OD", "OS", "OD", 
"OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", 
"OD", "OS", "OD", "OS", "OD", "OS", "OS", "OD", "OS", "OD", "OS", 
"OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", 
"OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", 
"OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", 
"OS", "OD", "OS", "OD", "OS", "OD", "OS", "OD", "OD", "OD", "OS", 
"OD", "OS", "OD", "OS", "OD", "OS"), G6PDcarente = c(0, 0, 0, 
0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 
0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), 
    age = c(70, 70, 38, 38, 54, 54, 41, 74, 31, 31, 27, 27, 36, 
    36, 36, 36, 49, 49, 34, 34, 49, 49, 34, 34, 33, 33, 34, 34, 
    38, 38, 62, 30, 30, 38, 38, 53, 53, 27, 27, 57, 57, 84, 84, 
    25, 25, 26, 57, 57, 47, 47, 29, 31, 31, 26, 26, 23, 23, 34, 
    34, 48, 48, 34, 34, 34, 34, 40, 40, 45, 45, 33, 33, 61, 61, 
    73, 32, 32, 67, 80, 39, 39, 67, 67, 37, 37, 28, 28, 26, 26, 
    32, 32, 24, 24, 61, 61, 36, 36, 66, 66, 26, 26, 35, 35, 39, 
    39, 32, 32, 39, 39, 39, 39, 42, 42, 35, 35, 64, 64, 34, 34, 
    37, 61, 80, 80, 74, 74, 62, 62, 71, 71)), row.names = c(NA, 
-128L), class = c("tbl_df", "tbl", "data.frame"))

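The number of observations assigned to the control and treated groups is exactly what it should be, because the assignment is based on the values of the G6PDcarente variable.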

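From the help file ?matchit:

formula: This argument takes the usual syntax of R formula, treat ~ x1 + x2, where treat is a binary treatment indicator and x1 and x2 are the pre-treatment covariates.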

In your case the formula is G6PDcarente ~ age, and the number of observations with G6PDcarente == 1 differs from the number with G6PDcarente == 0.

Since the dataset is not very large, we can verify this directly by inspecting it manually:

library(dplyr)
library(tidyr)

new.data.check <- newdata %>% 
  count(age, G6PDcarente) %>% # count all unique combinations of age & G6PDcarente
  spread(G6PDcarente, n) %>%  # create separate columns for G6PDcarente == 0 / == 1
  na.omit()                   # remove NA rows, where a specific age only has G6PDcarente == 0
                              # OR G6PDcarente == 1, but not both (i.e. unmatched samples)

> new.data.check    
# A tibble: 14 x 3
     age   `0`   `1`
   <dbl> <int> <int>
 1    26     3     4
 2    27     2     2
 3    31     2     2
 4    32     2     4
 5    34     6     8
 6    37     1     2
 7    38     2     4
 8    39     2     6
 9    49     2     2
10    61     1     4
11    62     2     1
12    67     2     1
13    74     2     1
14    80     2     1
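
The same table also shows how many 1:1 exact pairs are possible at most: within each age, the smaller of the two counts. Below is a minimal sketch in base R, with the counts copied by hand from the table above. The names `counts`, `n0`, `n1` and `max_pairs` are only illustrative and do not come from MatchIt.

```r
# Counts per age, copied from the new.data.check table above
counts <- data.frame(
  age = c(26, 27, 31, 32, 34, 37, 38, 39, 49, 61, 62, 67, 74, 80),
  n0  = c(3, 2, 2, 2, 6, 1, 2, 2, 2, 1, 2, 2, 2, 2),  # G6PDcarente == 0
  n1  = c(4, 2, 2, 4, 8, 2, 4, 6, 2, 4, 1, 1, 1, 1)   # G6PDcarente == 1
)

# Within one age, a 1:1 exact match can form at most min(n0, n1) pairs
counts$pairs <- pmin(counts$n0, counts$n1)

# Observations per group among ages that occur in both groups
colSums(counts[, c("n0", "n1")])

# Upper bound on the number of 1:1 exact pairs
max_pairs <- sum(counts$pairs)
max_pairs
```

For these counts the group totals are 31 and 42, and at most 27 pairs can be formed while keeping ages identical within each pair.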

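Without knowing your specific use case, I suppose that if you really want the same number of treated and control observations, you can always drop some observations…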
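
One possible way to drop observations, sketched in base R under some assumptions: the data frame `toy`, its columns `treat` and `age`, and the helper `balance_stratum` are all hypothetical, and the rows to discard are picked at random within each age. This is only an illustration of the idea, not something MatchIt does for you with method = "exact".

```r
# Hypothetical toy data; substitute your own data frame and column names
toy <- data.frame(
  id    = 1:10,
  treat = c(0, 0, 1, 1, 1, 0, 1, 1, 0, 1),
  age   = c(30, 30, 30, 30, 30, 40, 40, 50, 60, 60)
)

set.seed(1)  # rows to drop are chosen at random, so fix the seed

# Within one age stratum, keep the same number of rows from each group
balance_stratum <- function(d) {
  g0 <- which(d$treat == 0)
  g1 <- which(d$treat == 1)
  k  <- min(length(g0), length(g1))
  if (k == 0) return(d[0, , drop = FALSE])  # age present in one group only
  # sample positions rather than values, so length-1 vectors behave as expected
  keep <- c(g0[sample.int(length(g0), k)], g1[sample.int(length(g1), k)])
  d[sort(keep), , drop = FALSE]
}

balanced <- do.call(rbind, lapply(split(toy, toy$age), balance_stratum))
rownames(balanced) <- NULL

# Every remaining age now has equal counts in both groups
table(age = balanced$age, treat = balanced$treat)
```

Random dropping is a design choice made here for simplicity; if some observations should be preferred over others, the sampling step would need to be replaced by an explicit rule.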

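Thanks @Z.Lin for the reply. I have found a solution to my problem.

The column sums of the check table confirm the imbalance among the ages present in both groups (31 control vs. 42 treated observations):

> colSums(new.data.check)
age   0   1 
657  31  42

Here is the code I used, following the instructions in this manual: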

    OCTA.Filtered = as.data.frame(na.omit(OCTA.Filtered)) 
    m.out.test = matchit(G6PDcarente~age,method="nearest", data=OCTA.Filtered, ratio = 1)
    test_data = match.data(m.out.test) 
    ps.sd = sd(test_data$distance)
    # matching is performed below using propensity scores given the covariates mentioned below
    # caliper = 0.25 times sd of propensity scores (optimal)
    m.out = matchit(G6PDcarente~age,method="nearest", data=OCTA.Filtered, caliper = 0.25*ps.sd)
    # check the sample sizes (below)
    m.out 
    # Final matched data saved as final_data
    final_data = match.data(m.out) 
    # (here distance = propensity score)
> new.data.check <- final_data %>% 
+   count(age, G6PDcarente) %>% # count all unique combinations of age & G6PDcarente
+   spread(G6PDcarente, n) %>%  # create separate columns for G6PDcarente == 0 / == 1
+   na.omit()
> new.data.check
# A tibble: 14 x 3
     age   `0`   `1`
   <dbl> <int> <int>
 1    26     3     3
 2    27     2     2
 3    31     2     2
 4    32     2     2
 5    34     6     6
 6    37     1     1
 7    38     2     2
 8    39     2     2
 9    49     2     2
10    61     1     1
11    62     1     1
12    67     1     1
13    74     1     1
14    80     1     1
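
Note that the na.omit() step in this check removes every age that appears in only one group, so the table cannot show whether such ages are still present in the matched data. A small sketch of a stricter check in base R follows; `matched_toy` is a hypothetical stand-in for final_data, and table() is used because it keeps zero counts.

```r
# Hypothetical matched data standing in for final_data
matched_toy <- data.frame(
  G6PDcarente = c(0, 0, 0, 1, 1, 1),
  age         = c(26, 34, 41, 26, 34, 40)
)

# Cross-tabulate age by group; ages found in one group only get a zero count
tab <- table(age = matched_toy$age, G6PDcarente = matched_toy$G6PDcarente)
tab

# Ages where the two groups do not have the same number of rows
unbalanced_ages <- rownames(tab)[tab[, "0"] != tab[, "1"]]
unbalanced_ages
```

An empty result would mean that every age in the matched data has equal counts in both groups. Because a caliper on the propensity score can pair observations whose ages are close but not identical, this is worth running on final_data before treating the match as exact.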