R: How to group by a string column (text grades) in a data frame?
I have a df like this
Name Term Grade
David Spring A
Mike Spring B
Sherry Fall A+
Paul Fall D
Joy Fall C
Ken Spring B+
I want to group by the Grade column and see how many students have A, B, C, etc.
I am using
grading = c("A", "B", "C", "D")
grading_agg = sapply(grading, function(x) {
sum(grepl(x, df$Grade))
})
This gives me back
A B C D
2 2 1 1
I want to know how many A, B, C, D there are in Spring and in Fall separately. I am expecting something like this
Grade A B C D
Term
Spring 1 2 0 0
Fall 1 0 1 1
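I am trying the aggregate function, but it is not working as I expected. What am I missing?
We can use table after removing the +/- from 'Grade' with sub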
table(transform(df1, Grade = sub("[+-]", "", Grade))[-1])
# Grade
#Term A B C D
# Fall 1 0 1 1
# Spring 1 2 0 0
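To see what the one-liner is doing, the same steps can be written out one at a time. This is only a sketch: it rebuilds the sample data inline so that it runs on its own, and the intermediate names grade_letter and counts are made up for illustration.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  Name  = c("David", "Mike", "Sherry", "Paul", "Joy", "Ken"),
  Term  = c("Spring", "Spring", "Fall", "Fall", "Fall", "Spring"),
  Grade = c("A", "B", "A+", "D", "C", "B+"),
  stringsAsFactors = FALSE
)

# Step 1: strip a "+" or "-" so that "A+" and "A" count as the same grade
# (grade_letter is an illustrative name, not from the original answer)
grade_letter <- sub("[+-]", "", df1$Grade)

# Step 2: cross-tabulate Term against the cleaned grade
counts <- table(Term = df1$Term, Grade = grade_letter)
counts

# Optional: put the terms in the order used in the question
counts[c("Spring", "Fall"), ]
```

In the one-liner, the trailing [-1] drops the first column (Name) so that table() only sees Term and Grade.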
Or using tidyverse, we get the count by 'Term' and the substring of 'Grade', then spread it to 'wide' format
library(tidyverse)
df1 %>%
count(Term, Grade = str_remove(Grade, "[+-]")) %>%
spread(Grade, n, fill = 0)
# A tibble: 2 x 5
# Term A B C D
# <chr> <dbl> <dbl> <dbl> <dbl>
#1 Fall 1 0 1 1
#2 Spring 1 2 0 0
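In recent tidyr releases spread() is superseded by pivot_wider(). A roughly equivalent sketch follows, assuming dplyr, stringr and tidyr (1.1.0 or later, for names_sort) are installed. The sample data is rebuilt inline and the name wide is only illustrative.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Sample data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  Name  = c("David", "Mike", "Sherry", "Paul", "Joy", "Ken"),
  Term  = c("Spring", "Spring", "Fall", "Fall", "Fall", "Spring"),
  Grade = c("A", "B", "A+", "D", "C", "B+"),
  stringsAsFactors = FALSE
)

wide <- df1 %>%
  # count rows per Term and per grade letter with the +/- removed
  count(Term, Grade = str_remove(Grade, "[+-]")) %>%
  # one column per grade; combinations that never occur are filled with 0
  pivot_wider(
    names_from  = Grade,
    values_from = n,
    values_fill = list(n = 0L),
    names_sort  = TRUE  # keep the grade columns in alphabetical order
  )
wide
```

Without names_sort = TRUE, pivot_wider() orders the new columns by first appearance in the data rather than alphabetically, so the grade columns would not come out as A, B, C, D here.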
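table(transform()) works like a charm. Thank you so much
@Kenneth You can accept the answer by clicking the check mark to the left of the answer box, if it works for you. Thanks
data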
df1 <- structure(list(Name = c("David", "Mike", "Sherry", "Paul", "Joy",
"Ken"), Term = c("Spring", "Spring", "Fall", "Fall", "Fall",
"Spring"), Grade = c("A", "B", "A+", "D", "C", "B+")), .Names = c("Name",
"Term", "Grade"), class = "data.frame", row.names = c(NA, -6L
))
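For comparison, the sapply/grepl attempt from the question can also be split by term by swapping sum() for tapply(). This is a sketch, not part of the original answer: the name grading_by_term is made up, and the data is rebuilt inline so the snippet runs on its own.

```r
# Sample data from the question, rebuilt so the snippet is self-contained
df1 <- data.frame(
  Name  = c("David", "Mike", "Sherry", "Paul", "Joy", "Ken"),
  Term  = c("Spring", "Spring", "Fall", "Fall", "Fall", "Spring"),
  Grade = c("A", "B", "A+", "D", "C", "B+"),
  stringsAsFactors = FALSE
)

grading <- c("A", "B", "C", "D")

# For each grade letter, count the matching rows separately within each term
grading_by_term <- sapply(grading, function(x) {
  tapply(grepl(x, df1$Grade, fixed = TRUE), df1$Term, sum)
})
grading_by_term
```

This relies on substring matching, so it only works while no grade label is contained in another; the sub() plus table() approach does not have that limitation.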