R: Replace values in a data frame with the NA values of another data frame

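I want to replace values in one data frame with the NAs of another data frame that has the same identifiers. That is, for every row of df1, wherever df2 has NA for the same id and column, the value in df1 should become NA.

I have df1 and df2: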

df1 =data.frame(id = c(1,1,2,2,6,6),a = c(2,4,1,7,5,3), b = c(5,3,0,3,2,5),c = c(9,3,10,33,2,5))
df2 =data.frame(id = c(1,2,6),a = c("NA",0,"NA"), b= c("NA", 9, 9),c=c(0,"NA","NA"))
What I want is df3:

df3 = data.frame(id = c(1,1,2,2,6,6),a = c("NA","NA",1,7,"NA","NA"), b = c("NA","NA",0,3,2,5),c = c(9,3,"NA","NA","NA","NA"))
I have tried a lookup function and the data.table library, but I can't get the correct df3. Can anyone help me?

We can do a join on 'id' and then multiply each value by a factor that is NA wherever df2 has NA (and 1 otherwise):

library(data.table)
nm1 <- names(df1)[-1]
setDT(df1)[df2,  (nm1) := Map(function(x, y) x*(NA^is.na(y)), .SD, 
                  mget(paste0('i.', nm1))), on = .(id), .SDcols = nm1]
df1
#   id  a  b  c
#1:  1 NA NA  9
#2:  1 NA NA  3
#3:  2  1  0 NA
#4:  2  7  3 NA
#5:  6 NA  2 NA
#6:  6 NA  5 NA
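
The `NA^is.na(y)` factor in the code above can look cryptic. Here is a minimal base R sketch of what it does, using made-up vectors `x` and `y` that are not from the question. In R, `NA^0` is 1 and `NA^1` is NA, so the expression is NA where `y` is missing and 1 elsewhere, and multiplying by it blanks out exactly those positions.

```r
# Illustrative vectors, not taken from the question
x <- c(2, 1, 5)
y <- c(NA, 0, NA)

# is.na(y) is TRUE/FALSE, which is treated as 1/0 when used as an exponent:
# NA^1 is NA, while NA^0 is 1
mask <- NA^is.na(y)

# Multiplying keeps x where y is present and gives NA where y is missing
masked <- x * mask
print(masked)
```

Note that this only works for numeric columns, since it relies on multiplication.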

Note: in the OP's post the NA values were the string "NA", not real NA. The code above assumes real NA values in df2.
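
If df2 really does contain the text "NA" as in the question, one possible way to get real missing values is to convert those strings first. The following is a hedged base R sketch: the helper name `to_real_na` is made up for illustration, and it assumes every column other than `id` is otherwise numeric.

```r
df2 <- data.frame(id = c(1, 2, 6),
                  a = c("NA", 0, "NA"),
                  b = c("NA", 9, 9),
                  c = c(0, "NA", "NA"))

# Hypothetical helper: turn the text "NA" into a real NA,
# then make the column numeric
to_real_na <- function(col) {
  col <- as.character(col)   # also covers factor columns from older R versions
  col[col == "NA"] <- NA     # real missing value instead of the string
  as.numeric(col)
}

df2[-1] <- lapply(df2[-1], to_real_na)   # every column except id
str(df2)
```

After this step `is.na()` reports the missing cells of df2 as expected.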

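Because the NA values in your example are actually the text "NA", all variables have to be converted to text with as.character. You can then join the two data frames on the id column. Since both data frames have columns a, b and c, the join renames them to a.x, b.x and c.x (from df1) and a.y, b.y and c.y (from df2). After that you can create the new columns a, b and c: a is "NA" when a.y == "NA" and a.x otherwise, and so on. If your NA values are real NA, you need to test for them differently, with is.na(value); see the example at the end of the code below.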

library(dplyr)

df1 %>%  
  mutate_all(as.character) %>% ## all variables as text
  left_join(df2 %>% 
              mutate_all(as.character) ## all variables as text
            , by = "id") %>% ## join tables by 'id'; a.x comes from df1, a.y from df2, and so on
  mutate(a = case_when(a.y == "NA" ~ "NA", TRUE ~ a.x), ## if a.y == "NA" take "NA", else a.x
         b = case_when(b.y == "NA" ~ "NA", TRUE  ~ b.x),
         c = case_when(c.y == "NA" ~ "NA", TRUE ~ c.x)) %>%
  select(id, a, b, c) ## keep only the original columns

  id  a  b  c
1  1 NA NA  9
2  1 NA NA  3
3  2  1  0 NA
4  2  7  3 NA
5  6 NA  2 NA
6  6 NA  5 NA

## If your data frame had real NA values, this is how to test for them:
missing_value <- NA

is.na(missing_value) ## TRUE
missing_value == NA  ## NA, not TRUE: == cannot be used to test for NA
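
For data frames that hold real NA values, the same idea (look up each id of df1 in df2, then blank out the cells whose counterpart is missing) can be sketched in base R with is.na(). This is an alternative to the dplyr pipeline above rather than part of the original answer. It assumes df2 has one row per id and the same value columns as df1.

```r
df1 <- data.frame(id = c(1, 1, 2, 2, 6, 6),
                  a = c(2, 4, 1, 7, 5, 3),
                  b = c(5, 3, 0, 3, 2, 5),
                  c = c(9, 3, 10, 33, 2, 5))
# df2 with real NA values instead of the text "NA"
df2 <- data.frame(id = c(1, 2, 6),
                  a = c(NA, 0, NA),
                  b = c(NA, 9, 9),
                  c = c(0, NA, NA))

cols <- setdiff(names(df1), "id")   # value columns shared by both data frames
idx  <- match(df1$id, df2$id)       # row of df2 that belongs to each row of df1

# TRUE where the matching df2 cell is a real NA (tested with is.na, not ==)
mask <- is.na(as.matrix(df2[idx, cols]))
mask[is.na(idx), ] <- FALSE         # ids that are absent from df2 leave df1 untouched

df3 <- df1
df3[cols][mask] <- NA
print(df3)
```

Unlike the text-based version, the columns of df3 stay numeric here, which is usually more convenient for later calculations.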