R: Aggregate information by quarter in data.table, setting new names in the column used in by

I have a data.table that is the aggregated result of a larger table:

data.table(Period = c('2018.01', '2018.02'), sales = c(8850, 7950), qty = c(650, 650))

    Period sales qty
1: 2018.01  8850 650
2: 2018.02  7950 650

What I need is to summarize the information by quarter, but I have not managed to do it. The result should be:

data.table(Period = c('2018.01', '2018.02', '2018Q1', '2018'), sales = c(8850, 7950, 16800, 16800), qty = c(650, 650, 1300, 1300))

   Period sales  qty
1: 2018.01  8850  650
2: 2018.02  7950  650
3:  2018Q1 16800 1300
4:    2018 16800 1300

I tried:

dt = rbind(dt, dt[, lapply(.SD, sum), by = .(Period), .SDcols = c('sales', 'qty')])

But I just get the same rows repeated:

    Period  ums men
1: 2018.01 8850 650
2: 2018.02 7950 650
3: 2018.01 8850 650
4: 2018.02 7950 650

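In addition, I need the Period cell of each quarterly row to be renamed to Q1 (Q2, Q3, Q4), and the Period cell of the total row to be just the year. How can this be done?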
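
The attempt above returns the input rows again because grouping by Period itself puts every month in its own group, so nothing is collapsed. The following is a minimal sketch of grouping by a derived key instead, assuming Period always has the form YYYY.MM. The names by_month, by_quarter, by_year and result are illustrative and do not come from the question.

```r
library(data.table)

dt <- data.table(Period = c("2018.01", "2018.02"),
                 sales = c(8850, 7950), qty = c(650, 650))

# Grouping by Period itself: each month is its own group, so the sums equal the input.
by_month <- dt[, lapply(.SD, sum), by = .(Period), .SDcols = c("sales", "qty")]

# Grouping by a derived key: months 1-3 map to Q1, 4-6 to Q2, and so on.
by_quarter <- dt[, lapply(.SD, sum),
                 by = .(Period = paste0(substr(Period, 1, 4), "Q",
                                        (as.integer(substr(Period, 6, 7)) - 1L) %/% 3L + 1L)),
                 .SDcols = c("sales", "qty")]

# The yearly total uses the first four characters of Period as the key.
by_year <- dt[, lapply(.SD, sum), by = .(Period = substr(Period, 1, 4)),
              .SDcols = c("sales", "qty")]

# Monthly rows first, then the quarterly and yearly totals.
result <- rbind(dt, by_quarter, by_year)
```

Because the new labels are built inside by, no helper column is left behind in dt.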

Edit

Although the accepted answer is correct, I have modified it so that I don't need to add extra columns or install new libraries:

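library(data.table)

DT = data.table(Period = c('2018.01', '2018.02'), sales = c(8850, 7950), qty = c(650, 650))

# Turn '2018.01' into the number 201801 so cut() can bin it (base sub() instead of stringr's str_replace)
DT$Period = as.double(sub("\\.", "", DT$Period))
ints      = setInterval(2018)  # helper function shown further down; define it before running this
# Quarterly sums, labelled with the helper's quarter names
dt        = DT[, lapply(.SD, sum), by = .(Period = cut(Period, breaks = ints$i, labels = ints$q)), .SDcols = c('sales', 'qty')]
# Yearly total of the quarterly rows; the label goes in j because a length-1 by fails once several quarters exist
dt        = rbind(dt, dt[Period %in% ints$q, c(list(Period = '2018'), lapply(.SD, sum)), .SDcols = c('sales', 'qty')], fill = TRUE)
# Restore the 'YYYY.MM' labels (substr() replaces right(), which base R does not define)
DT$Period = paste(substr(DT$Period, 1, 4), ".", substr(DT$Period, 5, 6), sep = "")
DT        = rbind(DT, dt)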

I needed to create this helper function:

setInterval = function (year) {
   y = year * 100
   return (list(
      i = c(y, y + 3, y + 6, y + 9, y + 12),
      q = paste(year, '.', c('Q1', 'Q2', 'Q3', 'Q4'), sep = '')
   ))
}
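
To see how the helper's break points work together with cut(), here is a small sketch. The helper is repeated so the snippet runs on its own, and the sample values in periods are made up for illustration.

```r
# Same helper as above, repeated so that this snippet is self-contained.
setInterval <- function(year) {
  y <- year * 100
  list(
    i = c(y, y + 3, y + 6, y + 9, y + 12),
    q = paste(year, ".", c("Q1", "Q2", "Q3", "Q4"), sep = "")
  )
}

ints <- setInterval(2018)

# Hypothetical periods in YYYYMM form: January, March, April and December 2018.
periods <- c(201801, 201803, 201804, 201812)

# cut() uses right-closed intervals by default, so (201800, 201803] is Q1,
# (201803, 201806] is Q2, and so on.
quarters <- as.character(cut(periods, breaks = ints$i, labels = ints$q))
print(quarters)
```

Note that the labels produced this way contain a dot (2018.Q1), while the desired output in the question uses 2018Q1.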

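A similar but different approach, using lubridate and dplyr:

Convert Period to a date format. I like to use lubridate::parse_date_time. Note that I also create new columns for the Year and the Quarter:

library(dplyr)      # provides %>% and mutate()
library(lubridate)
df <- df %>% 
      mutate(Period = parse_date_time(Period, "ym")) %>%
      mutate(Year = year(Period)) %>% 
      mutate(Quarter = quarter(Period))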
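
The final join in this answer refers to two summary tables, Yearly and Quarterly, whose construction is not shown on this page. The following is only a sketch of how they might be built, assuming the summary columns are named Y.sales, Y.qty, Q.sales and Q.qty as in the printed result; it is not necessarily the answerer's original code.

```r
library(dplyr)
library(lubridate)

df <- data.frame(Period = c("2018.01", "2018.02"),
                 sales = c(8850, 7950), qty = c(650, 650))

# Same preparation step as in the answer.
df <- df %>%
  mutate(Period = parse_date_time(Period, "ym")) %>%
  mutate(Year = year(Period)) %>%
  mutate(Quarter = quarter(Period))

# Assumed construction: one row per year.
Yearly <- df %>%
  group_by(Year) %>%
  summarise(Y.sales = sum(sales), Y.qty = sum(qty)) %>%
  ungroup()

# Assumed construction: one row per year and quarter.
Quarterly <- df %>%
  group_by(Year, Quarter) %>%
  summarise(Q.sales = sum(sales), Q.qty = sum(qty)) %>%
  ungroup()
```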

Finally, use full_join to combine all the data:
final <- full_join(Yearly, Quarterly, by=c("Year")) %>% 
         full_join(., df, by=c("Year","Quarter"))

Note that the right-hand side of the line that sets the year-quarter period could be:

format(as.yearqtr(dt$Period, "%Y.%m"))
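
The answer this note belongs to is not preserved on this page, so the following is only a sketch of how as.yearqtr from the zoo package could feed a data.table aggregation. The column name Period_yq and the table name quarterly are hypothetical, and the explicit "%YQ%q" format is chosen here to match the 2018Q1 label from the question.

```r
library(data.table)
library(zoo)

dt <- data.table(Period = c("2018.01", "2018.02"),
                 sales = c(8850, 7950), qty = c(650, 650))

# as.yearqtr() parses the year-month string; format() renders the quarter label.
# Period_yq is a hypothetical helper column.
dt[, Period_yq := format(as.yearqtr(Period, "%Y.%m"), "%YQ%q")]

# Sum sales and qty per quarter, reusing the label as the new Period.
quarterly <- dt[, .(sales = sum(sales), qty = sum(qty)), by = .(Period = Period_yq)]
```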

Result of the lubridate and dplyr approach (final):

   Year Y.sales Y.qty Quarter Q.sales Q.qty     Period sales   qty
  <dbl>   <dbl> <dbl>   <int>   <dbl> <dbl>     <dttm> <dbl> <dbl>
1  2018   16800  1300       1   16800  1300 2018-01-01  8850   650
2  2018   16800  1300       1   16800  1300 2018-02-01  7950   650