R: Combining two string columns with alternating missing values into one [r, na, missing-data]


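I have a data frame with two columns, "a" and "b", that have alternating missing values (NA).

I tried merge and join, but neither gave me the result I wanted. Perhaps that is because I have no id to merge on? For integers I would simply work around this by adding the two columns together, but how do I do it in my case?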

I wrote a function for this type of task that works much like the SQL coalesce function. You would use it like this:

dd<-read.table(text="a      b
dog    NA
mouse  NA
NA   cat
bird   NA", header=T)

dd$c <- with(dd, coalesce(a,b))
dd
#       a    b     c
# 1   dog <NA>   dog
# 2 mouse <NA> mouse
# 3  <NA>  cat   cat
# 4  bird <NA>  bird
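The coalesce function used above is not defined in this thread. As a rough idea of how such a helper could look in base R, here is a minimal sketch; the name coalesce_sketch and its body are assumptions for illustration, not the answer author's original code. It assumes character (not factor) vectors of equal length.

```r
# Hypothetical helper (an assumption, not the original function from the answer):
# returns, element by element, the first non-NA value among its arguments.
coalesce_sketch <- function(...) {
  Reduce(function(x, y) {
    missing_idx <- which(is.na(x))    # positions still missing in the running result
    x[missing_idx] <- y[missing_idx]  # fill them from the next vector
    x
  }, list(...))
}

# Same data as above, built with character columns
dd <- data.frame(a = c("dog", "mouse", NA, "bird"),
                 b = c(NA, NA, "cat", NA),
                 stringsAsFactors = FALSE)
dd$c <- coalesce_sketch(dd$a, dd$b)
dd$c
```

Because the helper folds over all of its arguments, the same sketch would also accept three or more vectors.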

This is my attempt (modified by @MrFlick):

df$c

You can use a simple apply:
df$c <- apply(df,1,function(x)  x[!is.na(x)]  ) 

> df
      a    b     c
1   dog <NA>   dog
2 mouse <NA> mouse
3  <NA>  cat   cat
4  bird <NA>  bird
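One caveat with the apply approach: x[!is.na(x)] only yields a single value per row when exactly one of the two columns is filled. The sketch below, which assumes character columns named a and b and uses made-up extra rows, takes the first non-NA value per row so that rows with two values or none still produce one element each.

```r
# Illustrative data: row 5 has both values, row 6 has none (these rows are assumptions)
df <- data.frame(a = c("dog", "mouse", NA, "bird", "fox", NA),
                 b = c(NA, NA, "cat", NA, "owl", NA),
                 stringsAsFactors = FALSE)

# Take the first non-NA entry in each row; indexing an empty vector with [1] gives NA
first_non_na <- apply(df[c("a", "b")], 1, function(x) x[!is.na(x)][1])
df$c <- unname(first_non_na)
df$c
```

Restricting the call to df[c("a", "b")] also keeps the result stable if the code is run again after column c already exists.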

You can try pmax:

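# na.rm = TRUE is required: without it, any row containing an NA yields NA.
# This assumes a and b are character columns, not factors.
df$c <- pmax(df$a, df$b, na.rm = TRUE)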
df
#       a    b     c
# 1   dog <NA>   dog
# 2 mouse <NA> mouse
# 3  <NA>  cat   cat
# 4  bird <NA>  bird
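Two properties of pmax are worth keeping in mind here, shown in the small sketch below on made-up vectors: missing values propagate unless na.rm = TRUE is given, and when both inputs are present pmax returns the string that sorts later, not the one from the first column.

```r
x <- c("dog", NA, "ant")
y <- c(NA, "cat", "bee")

# Default na.rm = FALSE: any position with an NA in either input stays NA
with_na <- pmax(x, y)

# na.rm = TRUE: the non-missing value wins; position 3 has two values,
# and the one that sorts later ("bee") is returned
merged <- pmax(x, y, na.rm = TRUE)
merged
```

For strictly alternating NAs, as in the question, the second property never comes into play.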

Another option is which with arr.ind=TRUE:

indx <- which(!is.na(df), arr.ind=TRUE)
df$c <-  df[indx][order(indx[,1])]
df
#    a    b     c
#1   dog <NA>   dog
#2 mouse <NA> mouse
#3  <NA>  cat   cat
#4  bird <NA>  bird
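To see why the order() call is needed, the sketch below breaks the same idea into steps with named intermediates (the variable names present, vals and merged are mine). It assumes character columns and exactly one non-NA value per row.

```r
df <- data.frame(a = c("dog", "mouse", NA, "bird"),
                 b = c(NA, NA, "cat", NA),
                 stringsAsFactors = FALSE)

present <- !is.na(df)                   # logical matrix, TRUE where a value exists
indx <- which(present, arr.ind = TRUE)  # one (row, col) pair per non-NA cell, listed column by column
vals <- df[indx]                        # values in that column-major order: dog, mouse, bird, cat
merged <- vals[order(indx[, "row"])]    # reorder so the values follow the row order
merged
```

Since which() walks the matrix column by column, the value from column b ("cat") comes last in vals, and sorting by row index puts it back in third position.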

dplyr has exactly what you are looking for, the function coalesce():

library(dplyr)

Use if-else logic:
a<-c("dog","mouse",NA,"bird")
b<-c(NA,NA,"cat",NA)

test.df <-data.frame(a,b, stringsAsFactors = FALSE)
test.df$c <- ifelse(is.na(test.df$a), test.df$b, test.df$a)

test.df

      a    b     c
1   dog <NA>   dog
2 mouse <NA> mouse
3  <NA>  cat   cat
4  bird <NA>  bird

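Are those real NA values or fake ones?

Wouldn't apply(df, 1, function(x) na.omit(x)[1]) work just as well here and be a bit simpler? I would also use df[which(!is.na(df), arr.ind=TRUE)]

@akrun, that is a very nice vectorized approach. I would post it as your own answer.

For me, the best solution was the second option, using ifelse. Thanks.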

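# Alternative: max.col() gives, for each row, the column index of the non-NA entry;
# cbind() pairs it with the row number for matrix-style indexing.
df$c <- df[cbind(1:nrow(df), max.col(!is.na(df)))]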

library(dplyr)

a <- c("dog", "mouse", NA, "bird")
b <- c(NA, NA, "cat", NA)

# coalesce() returns, position by position, the first non-missing value
coalesce(a, b)
# [1] "dog"   "mouse" "cat"   "bird"
