How to create a new dataframe in R with aggregated column values, grouped by another column
I have used dplyr to add aggregate columns before, but I can't figure out how to create a new wide data frame that contains new aggregate columns (means), one per value of one column (SkillGroup), grouped by another column (EmployeeID). My original DF looks like this:
EmployeeID <- c(rep(1,5), rep(2,3))
SkillGroup <- c(rep("A",3), rep("B",2), "A", "B", "C")
Proficiency <- c(1,2,3,4,5,1,2,3)
mydata <- data.frame(EmployeeID, SkillGroup, Proficiency)
The result I want looks like this:
EmployeeID2 <- c(1,2)
MeanSkillA <- c(2,1)
MeanSkillB <- c(4.5,2)
MeanSkillC <- c(NA, 3)
desiredDF <- data.frame(EmployeeID2, MeanSkillA, MeanSkillB, MeanSkillC)
Group by EmployeeID and SkillGroup, aggregate the values, and then reshape to wide format with tidyr::spread:
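library(dplyr)
library(tidyr)

mydata %>%
  # prefix the group labels so the spread columns are named MeanSkillA, MeanSkillB, ...
  group_by(EmployeeID, SkillGroup = paste('MeanSkill', SkillGroup, sep="")) %>%
  summarise(MeanSkill = mean(Proficiency)) %>%
  spread(SkillGroup, MeanSkill)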
# A tibble: 2 x 4
# Groups: EmployeeID [2]
# EmployeeID MeanSkillA MeanSkillB MeanSkillC
#* <dbl> <dbl> <dbl> <dbl>
#1 1 2 4.5 NA
#2 2 1 2.0 3
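As a possible alternative, the same reshaping can be sketched with pivot_wider, which the tidyr documentation recommends over spread for new code. This assumes tidyr 1.0.0 or later (for pivot_wider) and dplyr 1.0.0 or later (for the .groups argument); the name wide is only illustrative. The names_prefix argument adds the MeanSkill prefix, so the paste call inside group_by is not needed.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
mydata <- data.frame(
  EmployeeID  = c(rep(1, 5), rep(2, 3)),
  SkillGroup  = c(rep("A", 3), rep("B", 2), "A", "B", "C"),
  Proficiency = c(1, 2, 3, 4, 5, 1, 2, 3)
)

wide <- mydata %>%
  group_by(EmployeeID, SkillGroup) %>%
  # one mean per employee and skill group; drop grouping afterwards
  summarise(MeanSkill = mean(Proficiency), .groups = "drop") %>%
  # one column per skill group; missing combinations become NA
  pivot_wider(
    names_from   = SkillGroup,
    values_from  = MeanSkill,
    names_prefix = "MeanSkill"
  )

print(wide)
```

Employee 1 has no rows for skill group C, so MeanSkillC is NA for that employee, matching the desired output in the question.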