R: sum multiple variables by group type in one df, without subsetting


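I'm looking for a faster way to summarise by group type, for many different groups in one df, without subsetting. Below are a sample data frame and the code I currently use. It feels verbose, and I assume there is a quicker way to do this. In this example, my code sums the health revenue grouped by name and then merges it back into the master data. I want to sum rev for both the health and vision variables, grouped by name. The key point is that I only want the revenue for health and for vision where that variable equals 1. Thanks for your help.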

#df
name = c("jerry","jerry","jerry","dave","dave","dave","mary","mary","mary") 
health = c(1,0,1,1,0,1,0,1,1) 
vision = c(0,1,0,0,1,0,1,0,0) 
rev =c(100,200,500,1000,800,300,400,600,300)
df = data.frame(name,health,vision,rev) 


#Subset health
health = subset(df, health == 1) 


#Sum by group type
library(dplyr)
health <- health %>% group_by(name) %>% 
  mutate(
    health_rev=sum(rev, na.rm = TRUE))


#Select variables
health <- health[c("name","health_rev")]


#Remove duplicates
health <- health[!duplicated(health$name), ]


#Merge back to master
master <- merge(x = df, y = health, by = "name", all.x = TRUE)

Something like this:

df %>% 
  group_by(name) %>% 
  mutate(health_rev = sum(rev[as.logical(health)]), 
          vision_rev = sum(rev[as.logical(vision)])) %>% 
  ungroup()

Result:

# A tibble: 9 × 6
   name health_rev vision_rev health vision   rev
  <chr>      <dbl>      <dbl>  <dbl>  <dbl> <dbl>
1  dave       1300        800      1      0  1000
2  dave       1300        800      0      1   800
3  dave       1300        800      1      0   300
4 jerry        600        200      1      0   100
5 jerry        600        200      0      1   200
6 jerry        600        200      1      0   500
7  mary        900        400      0      1   400
8  mary        900        400      1      0   600
9  mary        900        400      1      0   300
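
Since the question mentions many different group types, the same idea can be written once for a whole set of indicator columns instead of one `mutate()` argument per column. The following is only a sketch, assuming dplyr 1.0.0 or later (for `across()`) and that every indicator column is coded 0/1. The name `flag_cols` is a hypothetical helper introduced for illustration; it is not part of the original answer.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  name   = c("jerry", "jerry", "jerry", "dave", "dave", "dave", "mary", "mary", "mary"),
  health = c(1, 0, 1, 1, 0, 1, 0, 1, 1),
  vision = c(0, 1, 0, 0, 1, 0, 1, 0, 0),
  rev    = c(100, 200, 500, 1000, 800, 300, 400, 600, 300)
)

# Hypothetical helper: the 0/1 indicator columns to summarise
flag_cols <- c("health", "vision")

master <- df %>%
  group_by(name) %>%
  # For each indicator column, sum rev over the rows of the group where it is 1;
  # .names produces health_rev, vision_rev, ...
  mutate(across(all_of(flag_cols),
                ~ sum(rev[.x == 1], na.rm = TRUE),
                .names = "{.col}_rev")) %>%
  ungroup()

print(master)
```

Adding another indicator column then only requires extending `flag_cols`. A group with no flagged rows gets 0 for that column, because `sum()` of an empty vector is 0.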

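Sometimes reshaping the data to long form makes it easier to work with:

library(tidyverse); df %>% gather(var, val, health, vision) %>% filter(as.logical(val)) %>% group_by(name, var) %>% summarise(rev = sum(rev)) %>% spread(var, rev)

Much more elegant than my approach. Thanks, Chris.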
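
In current tidyr, `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. A rough equivalent of the long-format approach might look like the sketch below, assuming tidyr 1.0.0 or later and dplyr 1.0.0 or later. The names `var`, `val`, and `rev_by_type`, and the final join back to `df`, are illustrative choices rather than part of the answer above.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
df <- data.frame(
  name   = c("jerry", "jerry", "jerry", "dave", "dave", "dave", "mary", "mary", "mary"),
  health = c(1, 0, 1, 1, 0, 1, 0, 1, 1),
  vision = c(0, 1, 0, 0, 1, 0, 1, 0, 0),
  rev    = c(100, 200, 500, 1000, 800, 300, 400, 600, 300)
)

rev_by_type <- df %>%
  # One row per (original row, indicator column)
  pivot_longer(c(health, vision), names_to = "var", values_to = "val") %>%
  # Keep only rows where the indicator is set
  filter(val == 1) %>%
  group_by(name, var) %>%
  summarise(rev = sum(rev), .groups = "drop") %>%
  # Back to one row per name, with columns health_rev and vision_rev
  pivot_wider(names_from = var, values_from = rev, names_glue = "{var}_rev")

# Optional: attach the per-name totals to every row of the original data
master <- left_join(df, rev_by_type, by = "name")

print(rev_by_type)
```

One difference from the `mutate()` approach: a name with no flagged rows for a given type ends up with `NA` here rather than 0, unless `values_fill = 0` is passed to `pivot_wider()`.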