R: Sum specific rows by multiple groups


I have a data frame like the one below:

df <- data.frame(row.names = c(1,2,3,4,5,6,7,8), Week = c(1,1,2,2,52,52,53,53), State = c("Florida", "Georgia","Florida", "Georgia","Florida", "Georgia","Florida", "Georgia"), Count_2001 = c(25,16,83,45,100,98,22,34), Count_2002 = c(3, 78, 22, 5, 78, 6, 88, 97))

I would like to add the week 53 counts to week 52 within each state, so that the result looks like this:

df2 <- data.frame(row.names = c(1,2,3,4,5,6), Week = c(1,1,2,2,52,52), State = c("Florida", "Georgia","Florida", "Georgia","Florida", "Georgia"), Count_2001 = c(25,16,83,45,122,132), Count_2002 = c(3, 78, 22, 5, 166, 103))

Using aggregate:

s <- 52:53
tp <- transform(aggregate(cbind(Count_2001, Count_2002) ~ State, df[df$Week %in% s, ], sum),
          Week=52)
df <- merge(df[!df$Week %in% s, ], tp, all=TRUE)
df
#   Week   State Count_2001 Count_2002
# 1    1 Florida         25          3
# 2    1 Georgia         16         78
# 3    2 Florida         83         22
# 4    2 Georgia         45          5
# 5   52 Florida        122        166
# 6   52 Georgia        132        103

Change your 53s to 52s and do the summing by group:

library(dplyr)
df %>%
  mutate(Week = case_when(Week == 53 ~ 52, TRUE ~ Week)) %>%
  group_by(State, Week) %>%
  summarize(across(everything(), sum))
# # A tibble: 6 x 4
# # Groups:   State [2]
#   State    Week Count_2001 Count_2002
#   <chr>   <dbl>      <dbl>      <dbl>
# 1 Florida     1         25          3
# 2 Florida     2         83         22
# 3 Florida    52        122        166
# 4 Georgia     1         16         78
# 5 Georgia     2         45          5
# 6 Georgia    52        132        103
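
The same "recode 53 to 52, then sum by group" idea can also be sketched in base R without dplyr. This is only an illustration under the assumption that overwriting the Week column is acceptable; the variable name res is made up for the example, and the data is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question.
df <- data.frame(
  Week = c(1, 1, 2, 2, 52, 52, 53, 53),
  State = rep(c("Florida", "Georgia"), times = 4),
  Count_2001 = c(25, 16, 83, 45, 100, 98, 22, 34),
  Count_2002 = c(3, 78, 22, 5, 78, 6, 88, 97)
)

# Fold week 53 into week 52 (this overwrites the original Week values).
df$Week[df$Week == 53] <- 52

# Sum both count columns for every Week/State combination.
# 'res' is a hypothetical name used only in this sketch.
res <- aggregate(cbind(Count_2001, Count_2002) ~ Week + State,
                 data = df, FUN = sum)
res
```

Because the grouping formula lists both Week and State, no separate merge step is needed afterwards.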

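A simple alternative that avoids any state-specific handling is to create a new column, at the level you want to aggregate on, that holds the effective week number.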

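I would do it like this (using the tidyverse library):

library(dplyr)

df <- df %>%
    mutate(week1 = if_else(Week %in% c(52, 53), 52, Week))

and then you can summarise with

dfsumm <- df %>%
    group_by(State, week1) %>%
    summarise(across(starts_with("Count_"), sum), .groups = "drop")

I just had to make sure the input was a double vector rather than an integer vector, and this solution works great! Thank you.

Hm, I'm surprised that makes a difference. Glad it works!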
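
On the double-versus-integer point from the comment: as far as I know, older dplyr releases required the two value arguments of if_else to have exactly the same type, so an integer week column combined with the double literal 52 would fail there. A minimal sketch of two ways to keep the types consistent follows; the vector week_int and the other names are hypothetical and stand in for a week column that happens to be stored as integer.

```r
library(dplyr)

# Hypothetical integer-typed week column (for example, as read from a file).
week_int <- c(1L, 2L, 52L, 53L)

# Option 1: stay in integer by using the integer literal 52L.
week_a <- if_else(week_int %in% c(52L, 53L), 52L, week_int)

# Option 2: convert the column to double first, then use a plain 52.
week_dbl <- as.numeric(week_int)
week_b <- if_else(week_dbl %in% c(52, 53), 52, week_dbl)
```

Either option should work regardless of the dplyr version, since both branches of if_else then share one type.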