R: How to replace NAs in data frame rows where the row is not all NAs
I have a data frame that looks like this:
df <- data.frame(matrix(c(1,351,NA,1,0,2,585,0,1,1,3,321,NA,0,1,4,964,NA,NA,NA,5,556,0,1,NA), ncol = 5, byrow = TRUE))
colnames(df) <- c('id','value','v1','v2','v3')
R> df
  id value v1 v2 v3
1  1   351 NA  1  0
2  2   585  0  1  1
3  3   321 NA  0  1
4  4   964 NA NA NA
5  5   556  0  1 NA
The desired result looks like this:
  id value v1 v2 v3
1  1   351  0  1  0
2  2   585  0  1  1
3  3   321  0  0  1
4  4   964 NA NA NA
5  5   556  0  1  0
Note that df[4,] still has NA for c('v1','v2','v3').
With dplyr, you can try:
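library(dplyr)
# columns to check for NA
cols <- c("v1", "v2", "v3")
df %>%
  # flag rows where every column in cols is NA
  mutate(row_na = rowSums(is.na(select(., one_of(cols)))) == length(cols)) %>%
  # replace NA with 0 only in rows that are not flagged
  mutate_at(vars(one_of(cols)), ~ ifelse(!row_na, replace(., is.na(.), 0), .)) %>%
  select(-row_na)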
id value v1 v2 v3
1 1 351 0 1 0
2 2 585 0 1 1
3 3 321 0 0 1
4 4 964 NA NA NA
5 5 556 0 1 0
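In dplyr 1.0.0 and later, mutate_at() and one_of() are superseded by across() and all_of(). The following is a minimal sketch of the same idea in that style, not the answer's original code. The names all_na and result are introduced here for illustration, and the all-NA flag is computed once up front so the replacement does not depend on the order in which columns are updated.

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(matrix(c(1,351,NA,1,0, 2,585,0,1,1, 3,321,NA,0,1,
                          4,964,NA,NA,NA, 5,556,0,1,NA),
                        ncol = 5, byrow = TRUE))
colnames(df) <- c("id", "value", "v1", "v2", "v3")

cols <- c("v1", "v2", "v3")

# TRUE for rows where every column in cols is NA (illustrative name)
all_na <- rowSums(is.na(df[cols])) == length(cols)

# Replace NA with 0 only in rows that are not entirely NA
result <- df %>%
  mutate(across(all_of(cols), ~ ifelse(is.na(.x) & !all_na, 0, .x)))
```

Row 4 keeps its NAs because all_na is TRUE there, so the ifelse() condition is FALSE for every cell in that row.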
In base R, one way is:
#columns to check for NA
cols <- c("v1", "v2", "v3")
#rows which needs to be replaced
rows <- which(rowSums(is.na(df[cols])) != length(cols))
#Replace values which are NA to 0
df[rows, cols] <- replace(df[rows, cols], is.na(df[rows, cols]), 0)
df
# id value v1 v2 v3
#1 1 351 0 1 0
#2 2 585 0 1 1
#3 3 321 0 0 1
#4 4 964 NA NA NA
#5 5 556 0 1 0
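If this is needed for more than one data frame, the same base R logic can be wrapped in a small function. This is only a sketch: fill_partial_na and its arguments are hypothetical names, not part of base R or of the answer above.

```r
# Hypothetical helper: fill NA with `fill` in `cols`, but leave rows
# untouched when every column in `cols` is NA.
fill_partial_na <- function(data, cols, fill = 0) {
  na_mask <- is.na(data[cols])                # logical matrix, one column per entry in cols
  partial <- rowSums(na_mask) < length(cols)  # rows with at least one non-NA value
  na_mask[!partial, ] <- FALSE                # never touch rows that are entirely NA
  data[cols][na_mask] <- fill
  data
}

# Same sample data as in the question
df <- data.frame(matrix(c(1,351,NA,1,0, 2,585,0,1,1, 3,321,NA,0,1,
                          4,964,NA,NA,NA, 5,556,0,1,NA),
                        ncol = 5, byrow = TRUE))
colnames(df) <- c("id", "value", "v1", "v2", "v3")

out <- fill_partial_na(df, c("v1", "v2", "v3"))
```

The function returns a modified copy, so df itself is left unchanged.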
Here is a solution with a good old loop:
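for (r in seq_len(nrow(df)))
{
  # columns 3:5 are v1, v2, v3
  row_na <- is.na(df[r, 3:5])
  # check that the row is not all NA but does contain some NA
  if (!all(row_na) && sum(row_na) > 0)
  {
    # + 2 shifts positions within 3:5 back to column indices of df
    df[r, which(row_na) + 2] <- 0
  }
}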
A simple dplyr solution:
library(tidyverse)
df %>%
mutate_at(vars(v1:v3), ~ifelse(is.na(v1) & is.na(v2) & is.na(v3), NA, replace_na(., 0)))
As simple as it gets:
df[ !(is.na(df$v1) & is.na(df$v2) & is.na(df$v3)) & is.na(df) ] <- 0
Comment: How does this replace the NAs with zeros?
Reply: Oops, that was a typo. Edited!
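To see why the one-liner works, it can help to build the index in steps. The sketch below uses illustrative variable names (all_na, na_cell, mask) that do not appear in the answer. The key point is that a logical vector with one element per row is recycled down each column of the logical matrix returned by is.na(df), so element i applies to row i in every column.

```r
# Same sample data as in the question
df <- data.frame(matrix(c(1,351,NA,1,0, 2,585,0,1,1, 3,321,NA,0,1,
                          4,964,NA,NA,NA, 5,556,0,1,NA),
                        ncol = 5, byrow = TRUE))
colnames(df) <- c("id", "value", "v1", "v2", "v3")

# One logical value per row: TRUE where v1, v2 and v3 are all NA
all_na <- is.na(df$v1) & is.na(df$v2) & is.na(df$v3)

# One logical value per cell: TRUE where the cell is NA
na_cell <- is.na(df)

# The vector is recycled column by column, so the result is a matrix
# that is TRUE only for NA cells in rows that are not entirely NA
mask <- !all_na & na_cell

# Assigning through a logical matrix replaces exactly the TRUE cells
df[mask] <- 0
```

Because is.na(df) covers every column, an NA in id or value would also be set to 0 in a row that is not all NA in v1 to v3. Restricting the mask to the three columns avoids that if it matters for your data.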