R: apply a conditional replacement function to every cell of a data frame
I am trying to subset a data frame in R by checking whether each value is present in a specific list and keeping it if it is. For example, in the following data frame:
x <- data.frame(A = sample(1:5, 5),
B = sample(1:5, 5),
C = sample(1:5, 5))
  A B C
1 2 2 1
2 3 3 3
3 1 4 4
4 4 5 2
5 5 1 5
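What happens to the values that are dropped does not matter; if it is easier, they can be changed to NA. From browsing similar questions, lapply looks like it can do this, but as a beginner I am struggling to apply what I have seen to this scenario.

Collapse each row to the matching numbers and pad each row to length ncol. This assumes you want the numbers left-aligned, as in the expected output. (That is what the apply() code further down does; the lapply, for-loop and mutate_all snippets instead replace the non-matching values with NA in place.)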
set.seed(47)
x <- data.frame(A = sample(1:5, 5),
B = sample(1:5, 5),
C = sample(1:5, 5))
# with lapply
keep_vals = c(1, 3, 4)
x[] = lapply(x, function(y) {
  y[! y %in% keep_vals] = NA
  return(y)
})
x
#    A  B  C
# 1  3  1  1
# 2  1 NA NA
# 3 NA NA  4
# 4  4  3 NA
# 5 NA  4  3
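The same in-place replacement can also be written without an explicit per-column function body, by building a logical mask first. This is only a sketch: the data frame below is hard-coded for illustration (it is an assumption, not the output of set.seed(47)), and the name keep is hypothetical.

```r
# Illustrative data, hard-coded so the result does not depend on the RNG
x <- data.frame(A = c(3, 1, 5, 4, 2),
                B = c(1, 2, 5, 3, 4),
                C = c(1, 5, 4, 2, 3))
keep_vals <- c(1, 3, 4)

# sapply() over the columns yields a logical matrix with the same shape as x:
# TRUE where the cell value is one of keep_vals
keep <- sapply(x, function(col) col %in% keep_vals)

# A data frame can be indexed by a logical matrix; blank out the rest
x[!keep] <- NA
x
```

Building the mask separately makes it easy to inspect which cells will be kept before anything is overwritten.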
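# with apply(): keep the matches in each row, then pad the row back to ncol(d)
# (d is the question's data frame; it is read in with read.table() further down)
d <- setNames(as.data.frame(t(apply(d, 1, function(x) {
  x <- x[x %in% c(1, 3, 4)]
  `length<-`(x, ncol(d))
}))), names(d))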
d
#    A  B  C
# 1  1 NA NA
# 2  3  3  3
# 3  1  4  4
# 4  4 NA NA
# 5 NA NA NA
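To make the "collapse, then pad" idea explicit, here is a step-by-step sketch of the same left-aligning behaviour. The helper name left_align_matches is hypothetical (it is not from the answer), and the data frame is hard-coded to match the values the answer reads into d.

```r
# Same values as the d used by the apply() answer
d <- data.frame(A = c(2, 3, 1, 4, 5),
                B = c(2, 3, 4, 5, 2),
                C = c(1, 3, 4, 2, 5))

# Hypothetical helper: for each row keep only values in keep_vals,
# then pad on the right with NA so every row has ncol(df) entries
left_align_matches <- function(df, keep_vals) {
  rows <- lapply(seq_len(nrow(df)), function(i) {
    vals <- unlist(df[i, ], use.names = FALSE)  # row i as a plain vector
    vals <- vals[vals %in% keep_vals]           # collapse to the matches
    c(vals, rep(NA, ncol(df) - length(vals)))   # pad back to full width
  })
  out <- as.data.frame(do.call(rbind, rows))
  names(out) <- names(df)
  out
}

res <- left_align_matches(d, c(1, 3, 4))
res
```

Note that left-aligning discards the information about which column a value came from; if that matters, the NA-in-place approaches are the safer choice.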
Using dplyr::bind_rows (the do.call(bind_rows, ...) call at the end):
set.seed(47) # reset data
x <- data.frame(A = sample(1:5, 5),
B = sample(1:5, 5),
C = sample(1:5, 5))
keep_vals = c(1, 3, 4)
# with a for loop over the columns
for (i in seq_along(x)) {
  x[, i][!x[, i] %in% keep_vals] <- NA
}
x
#    A  B  C
# 1  3  1  1
# 2  1 NA NA
# 3 NA NA  4
# 4  4  3 NA
# 5 NA  4  3
# with dplyr
library(dplyr)
x %>% mutate_all(
  ~replace(., !. %in% keep_vals, NA)
)
#    A  B  C
# 1  3  1  1
# 2  1 NA NA
# 3 NA NA  4
# 4  4  3 NA
# 5 NA  4  3
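In current dplyr releases the scoped verb mutate_all() is superseded by across(). A sketch of the same replacement in that form follows; it assumes dplyr 1.0.0 or later is installed, and the data frame is again hard-coded for illustration rather than drawn with set.seed(47).

```r
library(dplyr)

# Illustrative data, hard-coded so the result does not depend on the RNG
x <- data.frame(A = c(3, 1, 5, 4, 2),
                B = c(1, 2, 5, 3, 4),
                C = c(1, 5, 4, 2, 3))
keep_vals <- c(1, 3, 4)

# Apply the same replace() call to every column via across(everything(), ...)
res <- x %>%
  mutate(across(everything(), ~ replace(.x, !.x %in% keep_vals, NA)))
res
```

Swapping everything() for a narrower selection such as c(A, B) would restrict the replacement to those columns.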
# data for d, used by the apply() example
d <- read.table(text="A B C
1 2 2 1
2 3 3 3
3 1 4 4
4 4 5 2
5 5 2 5", header=TRUE)
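library(dplyr)  # provides bind_rows(); x here is apparently a fresh random draw, hence the different values
do.call(bind_rows, apply(x, 1, function(a) a[a %in% c(1, 3, 4)]))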
# # A tibble: 5 x 3
#       A     B     C
#   <int> <int> <int>
# 1     4    NA    NA
# 2     1     1     1
# 3     3     3    NA
# 4    NA    NA     4
# 5    NA     4     3