R: Match two data frames with different column names and create a new column from the mean of another column
I have two data frames. The first lists each school/team only once, as shown below:
classA <- data.frame(School=c("Omaha South", "Millard North", "Elkhorn"))
df2 %>%
right_join(df1, by = c('Winner' = 'School')) %>%
na.omit() %>%
count(Winner, name = "wins") %>%
right_join(df1, c('Winner' = 'School')) %>%
mutate(wins = replace(wins, is.na(wins), 0))
We can left_join classA with scores, and then take the mean of Away.Score for each School.
library(dplyr)
classA %>%
left_join(scores, by = c('School' = 'Away.Team')) %>%
group_by(School) %>%
summarise(AwayScore = mean(Away.Score, na.rm = TRUE))
# A tibble: 3 x 2
# School AwayScore
# <fct> <dbl>
#1 Elkhorn 60
#2 Millard North 84
#3 Omaha South 60
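The scores data frame itself is not shown in the question, so the answer above cannot be run as posted. Here is a minimal reproducible sketch, assuming a hypothetical scores data frame with the columns Away.Team and Away.Score. The values are invented for illustration and are not the asker's data.

```r
library(dplyr)

# Hypothetical data: the real `scores` data frame is not shown in the question.
classA <- data.frame(School = c("Omaha South", "Millard North", "Elkhorn"))
scores <- data.frame(
  Away.Team  = c("Omaha South", "Elkhorn", "Millard North",
                 "Millard North", "Lincoln East"),
  Away.Score = c(60, 60, 80, 88, 71)
)

# Join on differently named key columns (School in classA, Away.Team in scores),
# then average the away scores within each school.
result <- classA %>%
  left_join(scores, by = c("School" = "Away.Team")) %>%
  group_by(School) %>%
  summarise(AwayScore = mean(Away.Score, na.rm = TRUE))

print(result)
```

Because the join starts from classA, teams that appear only in scores (Lincoln East in this sketch) are dropped. A school with no matching rows in scores would get NaN, since mean() of an all-NA vector with na.rm = TRUE returns NaN.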
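If I already have other columns in classA, is there a way to do this without dropping them?
@JeffSwanson You can use `mutate` instead of `summarise` in the last line.
Thanks, but `mutate` seems to add an extra set of columns and rows from the scores data frame; I only need the one new column.
@JeffSwanson Which columns? Then include them in the grouping: `group_by(School, othercolumn)`.
@JeffSwanson Try `scores %>% group_by(Away.Team) %>% summarise(AwayScore = mean(Away.Score, na.rm = TRUE)) %>% right_join(classA, by = c('Away.Team' = 'School'))`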
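The last suggestion in the comments (summarise scores first, then join) can be sketched as below. This is a variation, not the commenter's exact code: it starts from classA with left_join so the key column keeps the name School. The Conference column and all values are hypothetical stand-ins for whatever extra columns classA already has.

```r
library(dplyr)

# Hypothetical data: `Conference` stands in for any extra column in classA.
classA <- data.frame(
  School     = c("Omaha South", "Millard North", "Elkhorn"),
  Conference = c("Metro", "Metro", "EMC")
)
scores <- data.frame(
  Away.Team  = c("Omaha South", "Elkhorn", "Millard North", "Millard North"),
  Away.Score = c(60, 60, 80, 88)
)

# Summarise first so there is exactly one row per away team ...
away_means <- scores %>%
  group_by(Away.Team) %>%
  summarise(AwayScore = mean(Away.Score, na.rm = TRUE))

# ... then attach that single column to classA without touching its other columns.
result <- classA %>%
  left_join(away_means, by = c("School" = "Away.Team"))

print(result)
```

Summarising before the join is what avoids the extra rows: joining the raw scores onto classA and then using mutate keeps one row per game, while joining a one-row-per-team summary keeps one row per school.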
aggregate(Away.Score~School,
merge(classA, scores, by.x = 'School', by.y = 'Away.Team'),
mean, na.rm = TRUE)
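One thing to watch with the base R version: merge() does an inner join by default, so a school with no away games disappears from the result. A possible variation, assuming hypothetical data in which Elkhorn has no away games, aggregates first and then merges with all.x = TRUE to keep every row of classA:

```r
classA <- data.frame(School = c("Omaha South", "Millard North", "Elkhorn"))

# Hypothetical scores; Elkhorn deliberately has no away games here.
scores <- data.frame(
  Away.Team  = c("Omaha South", "Millard North", "Millard North"),
  Away.Score = c(60, 80, 88)
)

# Mean away score per team (the formula interface drops NA rows by default).
away_means <- aggregate(Away.Score ~ Away.Team, data = scores, FUN = mean)
names(away_means)[names(away_means) == "Away.Score"] <- "AwayScore"

# all.x = TRUE makes this a left join, so schools without scores get NA.
result <- merge(classA, away_means,
                by.x = "School", by.y = "Away.Team", all.x = TRUE)

print(result)
```

This also keeps any other columns already present in classA, since only the one summary column is merged in.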