
How do I group by a string (letter grade) in a data frame in R?


I have a df like this

Name      Term    Grade
David     Spring  A
Mike      Spring  B
Sherry    Fall    A+
Paul      Fall    D
Joy       Fall    C
Ken       Spring  B+
I want to group by the Grade column to see how many students have an A, B, C, and so on.

I am using

grading = c("A", "B", "C", "D")
grading_agg = sapply(grading, function(x) {
    sum(grepl(x, df$Grade))
})
This gives me back

A  B  C  D   
2  2  1  1  
I want to know how many A, B, C, and D grades there are in Spring and in Fall separately. I am expecting something like this

       Grade  A   B   C   D
Term  
Spring        1   2   0   0
Fall          1   0   1   1

I have been trying the aggregate function, but it does not work as I expected. What am I missing?

We can strip the +/- from 'Grade' with sub and then call table on the 'Term' and 'Grade' columns

table(transform(df1, Grade = sub("[+-]", "", Grade))[-1])
#        Grade
#Term     A B C D
#  Fall   1 0 1 1
#  Spring 1 2 0 0
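
The one-liner above can be unpacked into separate steps, which may make it easier to see what each piece does. This is only a sketch in base R: it rebuilds the example data inline, and the names grade_letter, counts and counts_df are made up for illustration.

```r
# Example data from the question, rebuilt here so the snippet is self-contained
df1 <- data.frame(
  Name  = c("David", "Mike", "Sherry", "Paul", "Joy", "Ken"),
  Term  = c("Spring", "Spring", "Fall", "Fall", "Fall", "Spring"),
  Grade = c("A", "B", "A+", "D", "C", "B+"),
  stringsAsFactors = FALSE
)

# Step 1: remove the first "+" or "-" so that "A+" and "A" land in the same group
grade_letter <- sub("[+-]", "", df1$Grade)

# Step 2: cross-tabulate Term against the cleaned grade letter
counts <- table(Term = df1$Term, Grade = grade_letter)
print(counts)

# Optional: turn the contingency table into a plain wide data frame
counts_df <- as.data.frame.matrix(counts)
```

The rows come out in alphabetical order (Fall before Spring). If Spring should come first, one option is to pass factor(df1$Term, levels = c("Spring", "Fall")) to table instead of the raw column.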

Or, using tidyverse, we get the count of 'Term' and the substring of 'Grade', then spread it to 'wide' format

library(tidyverse)
df1 %>% 
  count(Term, Grade = str_remove(Grade, "[+-]")) %>% 
  spread(Grade, n, fill = 0)
# A tibble: 2 x 5
#  Term       A     B     C     D
#  <chr>  <dbl> <dbl> <dbl> <dbl>
#1 Fall       1     0     1     1
#2 Spring     1     2     0     0
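
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(), so the same idea could also be written as below. This is a sketch that assumes reasonably recent dplyr, tidyr (1.1.0 or later, for the scalar values_fill) and stringr are installed. The data is rebuilt inline, and res is just an illustrative name.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Example data from the question, rebuilt here so the snippet is self-contained
df1 <- data.frame(
  Name  = c("David", "Mike", "Sherry", "Paul", "Joy", "Ken"),
  Term  = c("Spring", "Spring", "Fall", "Fall", "Fall", "Spring"),
  Grade = c("A", "B", "A+", "D", "C", "B+"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  # count rows per Term and per grade letter (with "+" or "-" removed)
  count(Term, Grade = str_remove(Grade, "[+-]")) %>%
  # one column per grade letter; combinations that never occur become 0
  pivot_wider(names_from = Grade, values_from = n, values_fill = 0)

print(res)
```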
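table(transform()) works like a charm. Thank you very much.

@Kenneth You can accept the answer by clicking the check mark to the left of the answer box, if it works for you. Thanks.

data
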
df1 <- structure(list(Name = c("David", "Mike", "Sherry", "Paul", "Joy", 
"Ken"), Term = c("Spring", "Spring", "Fall", "Fall", "Fall", 
"Spring"), Grade = c("A", "B", "A+", "D", "C", "B+")), .Names = c("Name", 
"Term", "Grade"), class = "data.frame", row.names = c(NA, -6L
))