Merging data by group in R
I construct the following data.frame object:
name <- c("Homer", "Marge", "Bart", "Lisa", "Maggie")
incidents <- c(133, 36, 1242, 2, NA)
gender <- c("MALE", "FEMALE", "MALE", "FEMALE", "FEMALE")
data <- data.frame(name, incidents, gender)
First, I drop the rows where incidents is NA using
clean_data <- data[!is.na(incidents), ]
Next, I aggregate the mean of incidents by gender with
agg <- aggregate(incidents ~ gender, clean_data, mean)
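For reference, here is a self-contained sketch of the two steps so far and of what agg holds for this data. It filters on data$incidents rather than the free-standing incidents vector, which is a small change from the code above (the original only works because that vector still exists in the workspace). The name agg_direct is hypothetical and is there only to illustrate that the formula method of aggregate drops NA rows by default.

```r
# Rebuild the example data from the question
name <- c("Homer", "Marge", "Bart", "Lisa", "Maggie")
incidents <- c(133, 36, 1242, 2, NA)
gender <- c("MALE", "FEMALE", "MALE", "FEMALE", "FEMALE")
data <- data.frame(name, incidents, gender)

# Drop rows with a missing incident count, referring to the column explicitly
clean_data <- data[!is.na(data$incidents), ]

# Mean number of incidents per gender
agg <- aggregate(incidents ~ gender, clean_data, mean)
print(agg)
#   gender incidents
# 1 FEMALE      19.0
# 2   MALE     687.5

# The formula interface uses na.action = na.omit by default,
# so aggregating the unfiltered data should give the same table
agg_direct <- aggregate(incidents ~ gender, data, mean)
```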
Now I want to fill the NA values in incidents with the values from agg, so that data becomes
    name incidents gender
1  Homer       133   MALE
2  Marge        36 FEMALE
3   Bart      1242   MALE
4   Lisa         2 FEMALE
5 Maggie      19.0 FEMALE
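What is the simplest way to do this in base R?
You can use ave. It returns the group means (vals) in the same order as the rows of the original dataset. Then find the NA elements of the incidents column and replace them with the corresponding elements of vals: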
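# Group mean for every row, in the original row order
vals <- with(data, ave(incidents, gender,
                       FUN = function(x) mean(x, na.rm = TRUE)))
# Positions of the missing values
indx1 <- is.na(data$incidents)
# Overwrite only the missing values with their group mean
data$incidents[indx1] <- vals[indx1]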
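Since the question asks specifically about reusing agg, here is a minimal sketch of that route as an alternative to ave: look up each missing row's gender in agg with match. The variable name missing_rows is an assumption made for illustration; it is not from the original posts.

```r
# Rebuild the example data and the aggregate table from the question
name <- c("Homer", "Marge", "Bart", "Lisa", "Maggie")
incidents <- c(133, 36, 1242, 2, NA)
gender <- c("MALE", "FEMALE", "MALE", "FEMALE", "FEMALE")
data <- data.frame(name, incidents, gender)
agg <- aggregate(incidents ~ gender, data[!is.na(data$incidents), ], mean)

# Rows whose incident count is missing
missing_rows <- is.na(data$incidents)

# match() gives, for each missing row, the row of agg with the same gender
data$incidents[missing_rows] <-
  agg$incidents[match(data$gender[missing_rows], agg$gender)]

print(data)
```

This keeps agg as an explicit lookup table, which can be handy if the group means are needed elsewhere; the ave version avoids building agg at all.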
A shorter version, as @MrFlick showed in the comments, uses ifelse to replace the NA elements with the group mean:
data$incidents <- with(data, ave(incidents, gender,
  FUN = function(x) ifelse(is.na(x), mean(x, na.rm = TRUE), x)))
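The one-liner above can be wrapped in a small helper if it is needed for several columns. This is only a sketch: the function name fill_na_with_group_mean is hypothetical. It also shows one edge case worth knowing about: if every value in a group is NA, mean(x, na.rm = TRUE) is NaN, so that group is filled with NaN rather than a number.

```r
# Hypothetical helper: replace NA in x with the mean of its group
fill_na_with_group_mean <- function(x, group) {
  ave(x, group,
      FUN = function(v) ifelse(is.na(v), mean(v, na.rm = TRUE), v))
}

incidents <- c(133, 36, 1242, 2, NA)
gender <- c("MALE", "FEMALE", "MALE", "FEMALE", "FEMALE")

# Maggie's NA becomes the FEMALE mean of 36 and 2
filled <- fill_na_with_group_mean(incidents, gender)
print(filled)

# Edge case: group "b" has no observed values, so its mean is NaN
edge <- fill_na_with_group_mean(c(1, NA, NA), c("a", "b", "b"))
print(edge)
```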
Instead of ifelse, replace can also be used, as @Ananda Mahto shows with data.table.
For variety, here is an approach using data.table, which also demonstrates the replace function:
library(data.table)
as.data.table(data)[
  , incidents := replace(incidents, is.na(incidents),
                         mean(incidents, na.rm = TRUE)),
  by = gender][]
# name incidents gender
# 1: Homer 133 MALE
# 2: Marge 36 FEMALE
# 3: Bart 1242 MALE
# 4: Lisa 2 FEMALE
# 5: Maggie 19 FEMALE
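One detail of the data.table call is easy to miss: as.data.table makes a copy, so := updates that copy and the original data still contains the NA unless the result is assigned. A minimal sketch, assuming the data.table package is installed; the name filled_dt is hypothetical.

```r
library(data.table)

name <- c("Homer", "Marge", "Bart", "Lisa", "Maggie")
incidents <- c(133, 36, 1242, 2, NA)
gender <- c("MALE", "FEMALE", "MALE", "FEMALE", "FEMALE")
data <- data.frame(name, incidents, gender)

# Convert to a data.table copy, fill NA per gender, and keep the result
filled_dt <- as.data.table(data)[
  , incidents := replace(incidents, is.na(incidents),
                         mean(incidents, na.rm = TRUE)),
  by = gender]

# The copy is filled in; the original data.frame is unchanged
print(filled_dt)
print(data)
```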
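I had the same idea, but without data$incidents.
@MrFlick That looks much better. You could post it as a new answer.
Actually, I hadn't looked at the result before using ave. Honestly, I think calling ave on the full data.frame is the trick here. My answer would be redundant.
So what should the complete solution be? Am I performing too many steps?
@GeoffLittle MrFlick's version is the shortest.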