R: Find equal rows between data frames, treating NA as a value
I have two data frames:
df = structure(list(x = c(NA, NA, "b", "b", "b"), y = c("f", "f",
"f", "g", "g")), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
df2 = structure(list(x = c(NA, NA, "a", "b", "b"), y = c("g", "f",
"f", "g", "g")), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
I want to find the rows that are identical when NA is treated as a value:
df == df2
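By that reasoning, the second row should be TRUE. Instead, we get NA. The logic behind that result is clear, but can we modify
df == df2
so that these rows are treated as equal?
You can paste the values of each row together
and compare the resulting strings, i.e.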
do.call(paste, df) == do.call(paste, df2)
#[1] FALSE TRUE FALSE TRUE TRUE
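The paste comparison works on strings, so it has two edge cases worth checking before relying on it. The sketch below uses hypothetical data frames `a` and `b` (not from the question) and assumes that the carriage return `"\r"` never occurs in the data.

```r
# Hypothetical data chosen to expose two pitfalls of comparing pasted rows.
a <- data.frame(x = c(NA, "a b"), y = c("f", "c"), stringsAsFactors = FALSE)
b <- data.frame(x = c("NA", "a"), y = c("f", "b c"), stringsAsFactors = FALSE)

# Row 1: a real NA in `a` collides with the literal string "NA" in `b`.
# Row 2: "a b" + "c" and "a" + "b c" both paste to "a b c" with the default
#        separator (a single space).
naive <- do.call(paste, a) == do.call(paste, b)

# A separator assumed not to occur in the data removes the second collision.
# The NA-versus-"NA" collision in row 1 remains.
safer <- do.call(paste, c(a, sep = "\r")) == do.call(paste, c(b, sep = "\r"))

print(naive)
print(safer)
```

If the data can contain the literal string "NA", a cell-wise comparison based on is.na is likely the safer choice.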
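One option is to replace NA with a value that does not occur in the dataset, run the comparison, and use rowSums to check that every column in a row matches: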
rowSums(replace(df2, is.na(df2), "0") == replace(df, is.na(df), "0"))== 2
#[1] FALSE TRUE FALSE TRUE TRUE
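The replacement approach depends on the placeholder really being absent from both data frames. A minimal sketch that makes this assumption explicit is shown here; the placeholder `"__NA__"` and the variable names `sentinel` and `rows_equal` are hypothetical, and plain data frames stand in for the tibbles from the question.

```r
df  <- data.frame(x = c(NA, NA, "b", "b", "b"),
                  y = c("f", "f", "f", "g", "g"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(x = c(NA, NA, "a", "b", "b"),
                  y = c("g", "f", "f", "g", "g"),
                  stringsAsFactors = FALSE)

# Hypothetical placeholder; stop early if it already occurs in either input.
sentinel <- "__NA__"
stopifnot(!any(df == sentinel, na.rm = TRUE),
          !any(df2 == sentinel, na.rm = TRUE))

# Compare cell by cell after substituting the placeholder for NA, then require
# a match in every column (ncol(df) rather than a hard-coded 2).
rows_equal <- unname(
  rowSums(replace(df, is.na(df), sentinel) ==
          replace(df2, is.na(df2), sentinel)) == ncol(df)
)
print(rows_equal)
```

Using ncol(df) instead of the literal 2 keeps the check valid if columns are added later.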
Or, without replacing anything, build the logical condition with is.na:
rowSums((is.na(df) & is.na(df2)) | (!is.na(df) & !is.na(df2) & df == df2)) == ncol(df)
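A case the example data does not exercise is a row where only one of the two data frames has NA. The sketch below wraps an is.na based condition in a hypothetical helper, `rows_identical`, and runs it on hypothetical inputs `a` and `b` to check that such a row is reported as different rather than as equal or NA.

```r
# Hypothetical helper: a cell matches when both values are NA, or when both
# are non-NA and equal. A row matches when all of its cells match.
rows_identical <- function(a, b) {
  both_na <- is.na(a) & is.na(b)
  both_eq <- !is.na(a) & !is.na(b) & a == b   # FALSE & NA is FALSE, so no NA leaks through
  unname(rowSums(both_na | both_eq) == ncol(a))
}

# Row 1: NA on both sides. Row 2: NA on one side only. Row 3: no NA.
a <- data.frame(x = c(NA, NA, "b"), y = c("f", "f", "g"), stringsAsFactors = FALSE)
b <- data.frame(x = c(NA, "b", "b"), y = c("f", "f", "g"), stringsAsFactors = FALSE)

res <- rows_identical(a, b)
print(res)
```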
Try using is.na(df).
But how do I combine that with df == df2? Try rowSums(replace(df2, is.na(df2), "0") == replace(df, is.na(df), "0")) == 2