Cumulative sum by year in R

I have data like this:

Month_Yr revenue year mo
2016-01    1200  2016 01
2016-02    7826  2016 02
2016-03   11892  2016 03
2016-05   11376  2016 05
2016-06    9055  2016 06
2016-07    5000  2016 07
I want to create a column containing the cumulative revenue total for each year, listed month by month. So it would look like this:

Month_Yr revenue year mo cumsum
2016-01    1200  2016 01 1200 
2016-02    7826  2016 02 9026
2016-03   11892  2016 03 20918
2016-05   11376  2016 05 32294
2016-06    9055  2016 06 41349
2016-07    5000  2016 07 46349
The data continues through 2018, and some months (such as April 2016) have no value at all and are therefore left out. Thanks.

You can try:

library(dplyr)

# Sample data: the six revenue values are repeated for each of the three years
df <- data.frame(
  Month_Yr = c("2016-01", "2016-02", "2016-03", "2016-05", "2016-06", "2016-07", "2017-01", "2017-02", "2017-03",
               "2017-05", "2017-06", "2017-07", "2018-01", "2018-02", "2018-03", "2018-05", "2018-06", "2018-07"),
  Revenue  = rep(c(1200, 7826, 11892, 11376, 9055, 5000), times = 3)
)
df$year <- substr(df$Month_Yr, 1, 4)
df$mo   <- substr(df$Month_Yr, 6, 7)

# Sort chronologically, then restart the running total within each year
df <- df %>%
  arrange(year, mo) %>%
  group_by(year) %>%
  mutate(cumsum = cumsum(Revenue))
df
In base R, you can do the following:

# y holds the year so that ave() can restart the running total per year
transform(df, year = y <- sub("-.*", "", Month_Yr),
          month = sub(".*-", "", Month_Yr),
          revenue = ave(Revenue, y, FUN = cumsum))

This only takes the cumulative sum of a single column and ignores the year variable. You need to add a grouping function somewhere in there: ave, dplyr's group_by, data.table, or something else.

Yes, I realized that after rereading the question. I have updated the answer accordingly. Thanks!
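
To illustrate the point raised in the comments, here is a minimal base R sketch contrasting an ungrouped running total with a grouped one. The data frame name toy and the column names running_all and running_by_year are made up for this example. The 2016 figures come from the question, and the 2017 rows simply reuse them for illustration, as the sample data in the dplyr answer does. It assumes the rows are already sorted by month within each year.

```r
# Toy data: 2016 figures from the question; the 2017 rows reuse them for illustration
toy <- data.frame(
  Month_Yr = c("2016-01", "2016-02", "2016-03", "2017-01", "2017-02", "2017-03"),
  Revenue  = c(1200, 7826, 11892, 1200, 7826, 11892)
)
toy$year <- substr(toy$Month_Yr, 1, 4)

# Ungrouped: the running total carries on across the year boundary
toy$running_all <- cumsum(toy$Revenue)

# Grouped with ave(): the running total restarts at the first row of each year
toy$running_by_year <- ave(toy$Revenue, toy$year, FUN = cumsum)

print(toy)
```

In the ungrouped column the first 2017 row continues from the 2016 total, whereas the grouped column starts again from that row's own revenue, which is the behaviour the question asks for.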