How to do conditional grouping and summarising in dplyr
In R, how can I collapse the rows of a data frame by name so that Status is taken from the row with the largest value in the sum column, while the remaining columns are summed? For example, given this input:
Name   score1  score2  score3  sum  Status
John        1       1       0    2  A
John        0       3       0    3  B
Smith       0       1       3    4  A
Sean        1       2       1    4  A
Sean        1       0       2    3  B
Sean        5       1       1    7  C
Carl        0       1       1    2  A
I would like to get output like this:
Name   score1  score2  score3  sum  Status
John        1       4       0    5  B
Smith       0       1       3    4  A
Sean        7       3       4   14  C
Carl        0       1       1    2  A
We can total the sum column within each Name and take the Status from the row where sum is largest in that group.
library(dplyr)

df %>%
  group_by(Name) %>%
  # total the sum column and keep the Status of the row with the largest sum
  summarise(Sum = sum(sum), Status = Status[which.max(sum)])
#   Name    Sum Status
#   <fct> <int> <fct>
# 1 Carl      2 A
# 2 John      5 B
# 3 Sean     14 C
# 4 Smith     4 A
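The result above contains only the total and the Status, whereas the desired output in the question also sums score1, score2 and score3. The following is a minimal sketch of one way to get every column. It assumes dplyr 1.0.0 or later (for across() and relocate()) and rebuilds the example data with character columns instead of factors. The name result is only illustrative.

```r
library(dplyr)

# Example data rebuilt from the question (character columns instead of factors).
df <- data.frame(
  Name   = c("John", "John", "Smith", "Sean", "Sean", "Sean", "Carl"),
  score1 = c(1L, 0L, 0L, 1L, 1L, 5L, 0L),
  score2 = c(1L, 3L, 1L, 2L, 0L, 1L, 1L),
  score3 = c(0L, 0L, 3L, 1L, 2L, 1L, 1L),
  sum    = c(2L, 3L, 4L, 4L, 3L, 7L, 2L),
  Status = c("A", "B", "A", "A", "B", "C", "A"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Name) %>%
  summarise(
    # Pick Status first, while `sum` still holds the per-row values.
    Status = Status[which.max(sum)],
    # Then total each numeric column within the group.
    across(c(score1, score2, score3, sum), ~ sum(.x)),
    .groups = "drop"
  ) %>%
  # Move Status back to the end to match the layout in the question.
  relocate(Status, .after = last_col())

print(result)
```

The order inside summarise() matters here: if the columns were totalled before Status was chosen, which.max() would see the already collapsed sum. Note also that which.max() returns the first maximum, so ties within a group resolve to the earliest row, and the groups come back sorted by Name rather than in their original order.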
Data
df <- structure(list(Name = structure(c(2L, 2L, 4L, 3L, 3L, 3L, 1L),
.Label = c("Carl","John", "Sean", "Smith"), class = "factor"), score1 = c(1L, 0L,
0L, 1L, 1L, 5L, 0L), score2 = c(1L, 3L, 1L, 2L, 0L, 1L, 1L),
score3 = c(0L, 0L, 3L, 1L, 2L, 1L, 1L), sum = c(2L, 3L, 4L,
4L, 3L, 7L, 2L), Status = structure(c(1L, 2L, 1L, 1L, 2L,
3L, 1L), .Label = c("A", "B", "C"), class = "factor")), class = "data.frame",
row.names = c(NA, -7L))
Perfect. Thanks again.