How to create another column in R based on whether a variable appears in another data frame

I have two data frames that look similar to this:

>health
   ID Stroke Diab MI Age Sex
1   1      0    0  0  65   M
2   2      0    0  0  66   M
3   3      1    0  0  78   F
4   4      0    0  0  55   M
5   5      0    0  0  67   M
6   6      1    1  1  66   M
7   7      0    0  0  79   F
8   8      0    0  0  54   M
9   9      0    0  0  65   F
10 10      1    1  1  78   F

>Asthma
   ID Smoker Smoking_Status
1  12      2              0
2  15      0              1
3  24      1              0
4   2      2              1
5   8      2              0
6  53      1              1
7  10      0              0
8  32      0              0
9   1      0              0
10  5      1              1

This is the code I used to generate these example tables:

health <- data.frame(ID=c(1,2,3,4,5,6,7,8,9,10), Stroke = factor(c(0,0,1,0,0,1,0,0,0,1)), 
                     Diab = factor(c(0,0,0,0,0,1,0,0,0,1)), MI = factor(c(0,0,0,0,0,1,0,0,0,1)),
                     Age = factor(c(65,66,78,55,67,66,79,54,65,78)), 
                     Sex = factor(c("M","M","F","M","M","M","F","M","F","F")))

Asthma <- data.frame(ID=c(12,15,24,2,8,53,10,32,1,5), Smoker = factor(c(2,0,1,2,2,1,0,0,0,1)), 
                     Smoking_Status = factor(c(0,1,0,1,0,1,0,0,0,1)))

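One of many possible approaches:

# match() gives the position of each health$ID in Asthma$ID, or 0 if it is absent
health$asthma <- match(x = health$ID, table = Asthma$ID, nomatch = 0)
# collapse every positive position to 1, leaving a 0/1 indicator
health$asthma <- replace(x = health$asthma, list = which(health$asthma > 0), values = 1)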
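
To see why these two steps produce a 0/1 flag, it may help to look at the intermediate result of match(). The following is only a small sketch using the ID values from the question; the names health_id, asthma_id, pos and flag are illustrative and not part of the original answer.

```r
# ID values copied from the question's example tables
health_id <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
asthma_id <- c(12, 15, 24, 2, 8, 53, 10, 32, 1, 5)

# Position of each health ID inside asthma_id; 0 where there is no match
pos <- match(x = health_id, table = asthma_id, nomatch = 0)
print(pos)   # 9 4 0 0 10 0 0 5 0 7

# Any positive position means "present", so collapse it to 1
flag <- replace(x = pos, list = which(pos > 0), values = 1)
print(flag)  # 1 1 0 0 1 0 0 1 0 1
```

The positions themselves are not needed for the indicator, which is why the second step overwrites them.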

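Using data.table:

library(data.table)
health <- as.data.table(x = health)
Asthma <- as.data.table(x = Asthma)
# start with a column of zeros, then set the matching rows to 1 by reference
health[, `:=`(asthma = numeric(nrow(health)))]
set(x = health, i = which(health$ID %in% Asthma$ID), j = "asthma", value = 1)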


#> health
#    ID Stroke Diab MI Age Sex asthma
# 1:  1      0    0  0  65   M    1
# 2:  2      0    0  0  66   M    1
# 3:  3      1    0  0  78   F    0
# 4:  4      0    0  0  55   M    0
# 5:  5      0    0  0  67   M    1
# 6:  6      1    1  1  66   M    0
# 7:  7      0    0  0  79   F    0
# 8:  8      0    0  0  54   M    1
# 9:  9      0    0  0  65   F    0
#10: 10      1    1  1  78   F    1
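
The same "fill with zeros, then overwrite the matching rows" pattern can also be sketched in base R, without data.table. This is a hedged illustration rather than part of the original answer; the data frames below keep only the ID columns from the question to stay short.

```r
# Reduced versions of the question's tables (ID columns only, as an assumption)
health <- data.frame(ID = 1:10)
Asthma <- data.frame(ID = c(12, 15, 24, 2, 8, 53, 10, 32, 1, 5))

# Step 1: every row starts as "not in Asthma"
health$asthma <- 0

# Step 2: overwrite the rows whose ID also appears in Asthma
health$asthma[health$ID %in% Asthma$ID] <- 1

print(health)
```

Unlike set(), this assignment does not modify the object by reference, which is usually irrelevant for small tables.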

You can do this in one line with the data.table package:
> data.table::setDT(health)[, ind := ifelse(ID %in% Asthma$ID, 1, 0)]
> health

    ID Stroke Diab MI Age Sex ind
 1:  1      0    0  0  65   M   1
 2:  2      0    0  0  66   M   1
 3:  3      1    0  0  78   F   0
 4:  4      0    0  0  55   M   0
 5:  5      0    0  0  67   M   1
 6:  6      1    1  1  66   M   0
 7:  7      0    0  0  79   F   0
 8:  8      0    0  0  54   M   1
 9:  9      0    0  0  65   F   0
10: 10      1    1  1  78   F   1

as.integer(health$ID %in% Asthma$ID)
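
The as.integer(... %in% ...) suggestion can be applied to the sample IDs as follows. This is a minimal base R sketch; the column name ind is borrowed from the data.table answer, and in_asthma is an illustrative intermediate variable.

```r
# Reduced versions of the question's tables (ID columns only, as an assumption)
health <- data.frame(ID = 1:10)
Asthma <- data.frame(ID = c(12, 15, 24, 2, 8, 53, 10, 32, 1, 5))

# %in% returns a logical vector: TRUE where the ID occurs in Asthma$ID
in_asthma <- health$ID %in% Asthma$ID

# as.integer() maps TRUE to 1L and FALSE to 0L
health$ind <- as.integer(in_asthma)

print(health)
```

One difference from ifelse(cond, 1, 0) is the column type: as.integer() yields an integer column, while ifelse() with the constants 1 and 0 yields a double column.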