R: New column based on subgroups and percentage ranges in another column
I have a sample df as shown below:
df_test<- data.frame("Group.Name"=c("Group1","Group2","Group1","Group2","Group2","Group2","Group1"),
"Sub_group_name"=c("A","A","B","C","D","E","C"),
"Total%"=c(35,26,10,9,5,11,13))
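We can use cut to create the labels with the corresponding breaks, and then, within each 'Group.Name', replace the label of the row with the highest 'Total%' by its 'Sub_group_name':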
library(dplyr)

df_test %>%
  group_by(Group.Name) %>%
  mutate(category = as.character(cut(`Total%`, breaks = c(-Inf, 10, 30, Inf),
                                     labels = c("New_Group2", "New_Group1", "Other"),
                                     right = FALSE)),
         category = case_when(`Total%` == max(`Total%`) ~ Sub_group_name,
                              TRUE ~ category))
# A tibble: 7 x 4
# Groups:   Group.Name [2]
#   Group.Name Sub_group_name `Total%` category
#   <chr>      <chr>             <dbl> <chr>
# 1 Group1     A                    35 A
# 2 Group2     A                    26 A
# 3 Group1     B                    10 New_Group1
# 4 Group2     C                     9 New_Group2
# 5 Group2     D                     5 New_Group2
# 6 Group2     E                    11 New_Group1
# 7 Group1     C                    13 New_Group1
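Whether a value that falls exactly on a break (such as 10 or 30) lands in the lower or the upper bin depends on the `right` argument of `cut`. The following is a small sketch for checking this in isolation; the vector `x` and the names `left_closed` and `right_closed` are made up for illustration and are not part of the original data.

```r
# Hypothetical test values, including the two break points 10 and 30
x <- c(5, 9, 10, 11, 30, 35)
breaks <- c(-Inf, 10, 30, Inf)
labs <- c("New_Group2", "New_Group1", "Other")

# right = FALSE: intervals are [-Inf, 10), [10, 30), [30, Inf)
left_closed <- as.character(cut(x, breaks = breaks, labels = labs, right = FALSE))

# right = TRUE (the default): intervals are (-Inf, 10], (10, 30], (30, Inf]
right_closed <- as.character(cut(x, breaks = breaks, labels = labs, right = TRUE))

# Side-by-side comparison: only the rows for 10 and 30 differ
data.frame(x, left_closed, right_closed)
```

With `right = FALSE`, as used in the answer, 10 is labelled "New_Group1"; with the default `right = TRUE` it would be labelled "New_Group2".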
Comments:

In your example df_test, 'Total%' is a factor column. I think it should be numeric?

In df_output, one of the values in 'Total' is changed to 12, whereas it is 5 in 'df_test'. Also, please check the 'catgory' in 'df_output'. Based on the condition, some values should be in "New_Group1".

Actually "10" is in NewGroup2, because when I said 10-30 I meant excluding both 10 and 30. But after seeing the solution I understood how cut works, so there is no problem here. I have also asked an extended question, if you can help.

Data
df_test<- data.frame("Group.Name"=c("Group1","Group2","Group1","Group2","Group2",
"Group2","Group1"),
"Sub_group_name"=c("A","A","B","C","D","E","C"),
"Total%"=c(35,26,10,9,5,11,13), stringsAsFactors = FALSE,
check.names = FALSE)
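The `check.names = FALSE` argument in the data definition above matters because `Total%` is not a syntactically valid R name. A minimal sketch of the difference, using two throwaway data frames (`with_check` and `no_check` are illustrative names, not from the original post):

```r
# Default check.names = TRUE: invalid characters are replaced with "."
with_check <- data.frame("Total%" = c(35, 26))
names(with_check)

# check.names = FALSE keeps the column name exactly as given
no_check <- data.frame("Total%" = c(35, 26), check.names = FALSE)
names(no_check)
```

With the default, the column is named `Total.`, so code that refers to `` `Total%` `` in backticks would not find it; with `check.names = FALSE` the name `Total%` is preserved.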