R: Is there a way to sum all values within one year based on an as.Date variable?


Given df as:

  ID  Status     Created_Date  Booking_Date   Price_Booking
  1   Confirmed  "2013-03-01"  "2013-08-21"   400
  1   Confirmed  "2013-03-01"  "2013-10-01"   350
  2   Confirmed  "2013-04-11"  "2013-10-01"   299
  2   Confirmed  "2013-04-11"  "2013-10-01"   178
  3   Cancelled  "2013-02-21"  "2014-04-01"   99
  4   Confirmed  "2013-08-30"  "2013-10-01"   525
  5   Confirmed  "2014-01-01"  "2014-12-01"   439
  6   Confirmed  "2015-02-22"  "2015-11-18"   200
  6   Confirmed  "2015-07-13"  "2017-04-09"   100
I want to calculate each customer's revenue within the first year, counted from the Created_Date variable.

I tried:

 with(df, sum(Price_Booking[Status == "Confirmed" & format(as.Date(Created_Date), "%Y") == 2013 & format(as.Date(Booking_Date), "%Y") == 2013]))
However, this only calculates the revenue per calendar year; I want it to be relative to the creation date.

The expected output would be:

   ID    Sum_Price_Booking
   1     750
   2     477
   3     NA
   4     525
   5     439
   6     200

You can use the data.table approach, where the by= argument lets you choose the groups to aggregate over.

library(data.table)
library(lubridate)

    dt <- data.table(
      ID = c(1, 1, 2, 2, 3, 4, 5, 6, 6),
      Status = c(
        'Confirmed',
        'Confirmed',
        'Confirmed',
        'Confirmed',
        'Cancelled',
        'Confirmed',
        'Confirmed',
        'Confirmed',
        'Confirmed'
      ),
      Created_Date = as.Date(
        c(
          "2013-03-01",
          "2013-03-01",
          "2013-04-11",
          "2013-04-11",
          "2013-02-21",
          "2013-08-30",
          "2014-01-01",
          "2015-02-22",
          "2015-07-13"
        )
      ),
      Booking_Date = as.Date(
        c(
          "2013-08-21",
          "2013-10-01",
          "2013-10-01",
          "2013-10-01",
          "2014-04-01",
          "2013-10-01",
          "2014-12-01",
          "2015-11-18",
          "2017-04-09"
        )
      ),
      Price_Booking = c(400,
                        350,
                        299,
                        178,
                        99,
                        525,
                        439,
                        200,
                        100)
    )



    dt[Status == 'Confirmed', .(price_sum = sum(Price_Booking)), by = .(Year = year(Created_Date), ID)]
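
The call above groups by the calendar year of Created_Date. As a minimal sketch of a variant that instead measures each booking against its own Created_Date, the block below keeps only confirmed bookings made less than 365 days after creation and returns NA for IDs with no qualifying row. It rebuilds `dt` so it runs on its own; `within_year` and `res` are hypothetical names introduced for illustration, and the 365-day cutoff is an assumption about what "within the first year" means.

```r
library(data.table)

dt <- data.table(
  ID = c(1, 1, 2, 2, 3, 4, 5, 6, 6),
  Status = c("Confirmed", "Confirmed", "Confirmed", "Confirmed", "Cancelled",
             "Confirmed", "Confirmed", "Confirmed", "Confirmed"),
  Created_Date = as.Date(c("2013-03-01", "2013-03-01", "2013-04-11",
                           "2013-04-11", "2013-02-21", "2013-08-30",
                           "2014-01-01", "2015-02-22", "2015-07-13")),
  Booking_Date = as.Date(c("2013-08-21", "2013-10-01", "2013-10-01",
                           "2013-10-01", "2014-04-01", "2013-10-01",
                           "2014-12-01", "2015-11-18", "2017-04-09")),
  Price_Booking = c(400, 350, 299, 178, 99, 525, 439, 200, 100)
)

# Flag confirmed bookings made less than 365 days after the row's Created_Date
dt[, within_year := Status == "Confirmed" &
     as.numeric(Booking_Date - Created_Date) < 365]

# Sum the flagged prices per ID; an ID with no qualifying row gets NA, not 0
res <- dt[, .(Sum_Price_Booking = if (any(within_year)) {
  sum(Price_Booking[within_year])
} else {
  NA_real_
}), by = ID]

res
```

On the sample data this should give one row per ID, with NA for ID 3, whose only booking is cancelled.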

For the rows where the difference between Booking_Date and Created_Date is less than one year, we can group by ID and sum those values.

library(dplyr)
df %>%
  mutate_at(vars(ends_with("Date")), as.Date) %>%
  group_by(ID) %>%
  summarise(sum = sum(Price_Booking[Booking_Date - Created_Date < 365]))

#     ID   sum
#  <int> <int>
#1     1   750
#2     2   477
#3     3     0
#4     4   525
#5     5   439
#6     6   200
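
The `< 365` test above uses a fixed 365-day window, which is not quite the same as "before the first anniversary" when a leap day falls inside the window. The sketch below compares the two on a few dates, assuming lubridate is available; the vectors `created` and `booking` and both result names are hypothetical, and which definition is wanted depends on the business rule.

```r
library(lubridate)

created <- as.Date(c("2013-03-01", "2016-01-01", "2015-07-13"))
booking <- as.Date(c("2013-08-21", "2016-12-31", "2017-04-09"))

# Fixed 365-day window, as in the summarise() call above
within_365_days <- as.numeric(booking - created) < 365

# Calendar window: strictly before the first anniversary of the creation date
within_calendar_year <- booking < created %m+% years(1)

data.frame(created, booking, within_365_days, within_calendar_year)
```

The second pair differs between the two definitions: 2016 is a leap year, so 2016-12-31 is exactly 365 days after 2016-01-01 but still before the anniversary on 2017-01-01.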
Data

df <- structure(list(ID = c(1L, 1L, 2L, 2L, 3L, 4L, 5L, 6L, 6L), 
Status = structure(c(2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L), 
.Label = c("Cancelled", "Confirmed"), class = "factor"), 
Created_Date = structure(c(2L, 2L, 3L, 3L, 1L, 4L, 5L, 6L, 7L), 
.Label = c("2013-02-21", "2013-03-01", "2013-04-11", "2013-08-30", "2014-01-01", 
"2015-02-22", "2015-07-13"), class = "factor"), Booking_Date = 
structure(c(1L, 2L, 2L, 2L, 3L, 2L, 4L, 5L, 6L), 
.Label = c("2013-08-21", "2013-10-01", "2014-04-01", "2014-12-01", "2015-11-18", 
"2017-04-09"), class = "factor"), Price_Booking = c(400L, 350L, 299L, 178L, 99L, 
525L, 439L,200L, 100L)), class = "data.frame", row.names = c(NA, -9L))

Please don't use the Rstudio tag unless the question is specifically about that IDE.