R: new column based on subgroup and percentage range in another column


I have a sample df as shown below:

df_test<- data.frame("Group.Name"=c("Group1","Group2","Group1","Group2","Group2","Group2","Group1"),
                "Sub_group_name"=c("A","A","B","C","D","E","C"),
                "Total%"=c(35,26,10,9,5,11,13))

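We can use cut to create labels with the corresponding breaks, and then, within each "Group.Name", replace the category of the row with the highest "Total%" with its "Sub_group_name".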

library(dplyr)
df_test %>% 
  group_by(Group.Name) %>%
  mutate(category = as.character(cut(`Total%`, breaks = c(-Inf,10, 30, Inf), 
          labels = c("New_Group2", "New_Group1", "Other"), right = FALSE)), 
         category = case_when(`Total%` == max(`Total%`) ~ 
                          Sub_group_name,
                                   TRUE ~ category))
# A tibble: 7 x 4
# Groups:   Group.Name [2]
#  Group.Name Sub_group_name `Total%` category  
#  <chr>      <chr>             <dbl> <chr>     
#1 Group1     A                    35 A         
#2 Group2     A                    26 A         
#3 Group1     B                    10 New_Group1
#4 Group2     C                     9 New_Group2
#5 Group2     D                     5 New_Group2
#6 Group2     E                    11 New_Group1
#7 Group1     C                    13 New_Group1
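
To make the role of `right = FALSE` in the answer explicit, here is a minimal base R sketch that runs `cut` on the same `Total%` values with both settings. The variable names (`x`, `left_closed`, `right_closed`) are illustrative only and do not come from the original post.

```r
# Values taken from the `Total%` column of the example data
x <- c(35, 26, 10, 9, 5, 11, 13)
breaks <- c(-Inf, 10, 30, Inf)
labels <- c("New_Group2", "New_Group1", "Other")

# right = FALSE: intervals are [-Inf, 10), [10, 30), [30, Inf)
left_closed <- as.character(cut(x, breaks = breaks, labels = labels, right = FALSE))

# right = TRUE (the default): intervals are (-Inf, 10], (10, 30], (30, Inf]
right_closed <- as.character(cut(x, breaks = breaks, labels = labels, right = TRUE))

# Side-by-side comparison; only the boundary value 10 changes category
data.frame(x, left_closed, right_closed)
```

With `right = FALSE` the value 10 falls into "New_Group1"; with the default `right = TRUE` it falls into "New_Group2". All other values in this sample are labelled the same either way.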
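In your example, `Total%` is a factor column. I think it should be numeric?

In `df_output`, one of the `Total%` values is 12, whereas it is 5 in `df_test`. Also, please check `category` in `df_output`: based on the condition, some values should be in "New_Group1".

Actually, "10" belongs in New_Group2, because when I said 10-30 I meant excluding both 10 and 30. But after seeing the solution I understand how cut works, so there is no problem here.

If you can help, I have asked an extended question.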
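
On the comment about excluding both 10 and 30: a single `cut` call always closes each interval on one side, so a range that is open at both ends needs explicit comparisons. The sketch below is one possible way to do this in base R. It assumes, as a guess not confirmed in the thread, that 10 and below go to "New_Group2" and 30 and above go to "Other"; the sample vector `x` is made up for illustration.

```r
# Hypothetical sample values chosen to hit both boundaries
x <- c(5, 10, 11, 29.9, 30, 35)

# New_Group1 is the open interval (10, 30); the placement of the
# boundary values 10 and 30 is an assumption, not from the original post
category <- ifelse(x > 10 & x < 30, "New_Group1",
                   ifelse(x <= 10, "New_Group2", "Other"))

data.frame(x, category)
```

The same conditions could be written with `dplyr::case_when` inside the `mutate` call if the pipeline style of the answer is preferred.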
Data

df_test <- data.frame("Group.Name" = c("Group1", "Group2", "Group1", "Group2", "Group2",
                                       "Group2", "Group1"),
                      "Sub_group_name" = c("A", "A", "B", "C", "D", "E", "C"),
                      "Total%" = c(35, 26, 10, 9, 5, 11, 13),
                      stringsAsFactors = FALSE,
                      check.names = FALSE)
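
The `check.names = FALSE` argument matters here because the answer refers to the column as `` `Total%` ``. A small sketch of the difference, using illustrative variable names that are not part of the original post:

```r
# By default data.frame() passes column names through make.names(),
# which replaces the invalid "%" character with "."
default_names <- names(data.frame("Total%" = c(35, 26)))

# check.names = FALSE keeps the name exactly as written
kept_names <- names(data.frame("Total%" = c(35, 26), check.names = FALSE))

default_names  # "Total."
kept_names     # "Total%"
```

Without `check.names = FALSE`, the column would be named `Total.` and the backtick-quoted `` `Total%` `` in the `mutate` call would not be found.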