R: How can I use lapply to reassign values to existing columns of a data.table?
I want to replace the NAs in numeric columns with the median of that column.
dt <- data.table(
name = c("A","B","C","D","E"),
sex = c("M","F",NA,"F","M"),
age = c(1,2,3,NA,4),
height = c(178.1, 162.1, NA, 169.5, 172.3)
)
Using lapply over each of num.cols:
dt[,lapply(.SD, function(value)
ifelse(is.na(value), median(value, na.rm=TRUE), value)),
.SDcols = num.cols]
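The problem: I can't work out how to overwrite the columns containing NAs with the median-imputed vectors using data.table syntax.
We can use na.aggregate from zoo and specify FUN as median to impute the missing values with the median for the selected columns specified in .SDcols, and assign (:=) the values to the columns of interest.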
library(zoo)
dt[, (num.cols) := na.aggregate(.SD, FUN = median),.SDcols = num.cols]
dt
# name sex age height
#1: A M 1.0 178.1
#2: B F 2.0 162.1
#3: C NA 3.0 170.9
#4: D F 2.5 169.5
#5: E M 4.0 172.3
Almost there. To overwrite the desired columns, just use (num.cols) := before the lapply.
@MikeH. Thanks, I don't know how I missed that.
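As a minimal sketch of what that comment suggests, the lapply call from the question can be placed on the right-hand side of (num.cols) := so that the imputed vectors overwrite the original columns by reference. The thread never shows how num.cols is defined, so the definition below is an assumption, and impute_median is a hypothetical helper name that simply wraps the question's anonymous function.

```r
library(data.table)

dt <- data.table(
  name   = c("A", "B", "C", "D", "E"),
  sex    = c("M", "F", NA, "F", "M"),
  age    = c(1, 2, 3, NA, 4),
  height = c(178.1, 162.1, NA, 169.5, 172.3)
)

# Assumption: num.cols is not defined anywhere in the thread;
# here it is taken to be the two numeric columns.
num.cols <- c("age", "height")

# Hypothetical helper: replace the NAs in one vector with that vector's median.
impute_median <- function(value) {
  ifelse(is.na(value), median(value, na.rm = TRUE), value)
}

# lapply returns one imputed vector per column in .SDcols;
# (num.cols) := assigns them back to the same columns in place.
dt[, (num.cols) := lapply(.SD, impute_median), .SDcols = num.cols]
```

Under these assumptions, the missing age becomes 2.5 and the missing height becomes 170.9, the same values as in the zoo output above, while the character columns (including the NA in sex) are left untouched.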