R - Replicate rows based on a sequence of start and end dates


I have a data frame "DF" like this:

Flight.Start   Flight.End   Device      Partner   Creative   Days.in.Flight 
2015-08-31     2015-08-31   Standard    MSN       Video      35
What I need to do is "explode" it like this:

and so on... until the Date variable reaches 2015-10-04, and then move on to the next row to replicate.

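Basically, each row gets replicated (Days.in.Flight - 1) times, since the existing row already accounts for one day in the interval, and a new column "Date" is filled in with the relevant date within that flight. So if a row has start and end dates of 9/1 and 9/5, four duplicate rows would be appended to the existing row, a new column (Date) would be created, and the sequence of dates from the original row's flight start to its flight end would populate that column.

All date values are of class Date, Days.in.Flight is num, and the rest are factors.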

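EDIT

In response to the duplicate-question flag:

To clarify, this is different from the question it was flagged as a duplicate of, because my question is not really about how to replicate rows based on Days.in.Flight (I already know how to do that!). It is about how to add a column to the output data frame and fill it with sequential dates within the corresponding flight period. Thanks for the heads-up...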

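Here is one way using splitstackshape and dplyr. With expandRows() from the splitstackshape package, you can expand the data frame as you described. Then you want to add a sequence of dates using mutate(). What I did was group the data by the combination of Flight.Start and Flight.End, and then create a date sequence for each group with seq(). first() takes the first element of Flight.Start and of Flight.End within each group. That way you can create the sequence you want. I hope this helps.

DATA & CODE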

mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   Days.in.Flight = c(3, 6),
                   stringsAsFactors = FALSE)

#  Flight.Start Flight.End   Device Creative Days.in.Flight
#1   2015-09-01 2015-09-03 Standard    Video              3
#2   2015-09-10 2015-09-15 Standard    Video              6

library(splitstackshape)
library(dplyr)

expandRows(mydf, "Days.in.Flight", drop = FALSE) %>%
group_by(Flight.Start, Flight.End) %>%
mutate(Date = seq(first(Flight.Start),
                  first(Flight.End),
                  by = 1))

#  Flight.Start Flight.End   Device Creative Days.in.Flight       Date
#        (date)     (date)    (chr)    (chr)          (dbl)     (date)
#1   2015-09-01 2015-09-03 Standard    Video              3 2015-09-01
#2   2015-09-01 2015-09-03 Standard    Video              3 2015-09-02
#3   2015-09-01 2015-09-03 Standard    Video              3 2015-09-03
#4   2015-09-10 2015-09-15 Standard    Video              6 2015-09-10
#5   2015-09-10 2015-09-15 Standard    Video              6 2015-09-11
#6   2015-09-10 2015-09-15 Standard    Video              6 2015-09-12
#7   2015-09-10 2015-09-15 Standard    Video              6 2015-09-13
#8   2015-09-10 2015-09-15 Standard    Video              6 2015-09-14
#9   2015-09-10 2015-09-15 Standard    Video              6 2015-09-15

Here is an approach using base R:

mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   Days.in.Flight = c(3, 6),
                   stringsAsFactors = FALSE)

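# Repeat each row Days.in.Flight times
expanded <- mydf[rep(row.names(mydf), mydf$Days.in.Flight), ]
# Offset each copy from Flight.Start by 0, 1, 2, ... days
data.frame(expanded, Date = expanded$Flight.Start + (sequence(mydf$Days.in.Flight) - 1))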

> data.frame(expanded,Date=expanded$Flight.Start+(sequence(mydf$Days.in.Flight)-1))
    Flight.Start Flight.End   Device Creative Days.in.Flight       Date
1     2015-09-01 2015-09-03 Standard    Video              3 2015-09-01
1.1   2015-09-01 2015-09-03 Standard    Video              3 2015-09-02
1.2   2015-09-01 2015-09-03 Standard    Video              3 2015-09-03
2     2015-09-10 2015-09-15 Standard    Video              6 2015-09-10
2.1   2015-09-10 2015-09-15 Standard    Video              6 2015-09-11
2.2   2015-09-10 2015-09-15 Standard    Video              6 2015-09-12
2.3   2015-09-10 2015-09-15 Standard    Video              6 2015-09-13
2.4   2015-09-10 2015-09-15 Standard    Video              6 2015-09-14
2.5   2015-09-10 2015-09-15 Standard    Video              6 2015-09-15
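
A variation on the base R idea, in case Days.in.Flight is missing or not trusted: the sketch below derives the row counts from Flight.Start and Flight.End themselves. The names n_days and expanded are only illustrative, and it assumes every Flight.End is on or after its Flight.Start.

```r
mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   stringsAsFactors = FALSE)

# Inclusive number of days in each flight, computed from the dates
# (assumes Flight.End >= Flight.Start in every row)
n_days <- as.integer(mydf$Flight.End - mydf$Flight.Start) + 1L

# Repeat each row once per day, then add 0, 1, 2, ... days to the start date
expanded <- mydf[rep(seq_len(nrow(mydf)), n_days), ]
expanded$Date <- expanded$Flight.Start + (sequence(n_days) - 1L)
rownames(expanded) <- NULL
expanded
```

Because the counts come from the dates, the last Date in each block always lands on Flight.End, even if a Days.in.Flight column in the real data disagrees with the date span.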

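Or, using data.table: we convert the 'data.frame' to a 'data.table' (setDT(mydf)), replicate the sequence of rows by 'Days.in.Flight', subset the dataset based on that index (.SD[rep(…), and, grouped by 'Flight.Start' and 'Flight.End', create the 'Date' column.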

library(data.table)
setDT(mydf)[, .SD[rep(1:.N, Days.in.Flight)]][, 
     Date:= seq(Flight.Start , Flight.End, by = '1 day'),
     by = .(Flight.Start, Flight.End)][]
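
One caveat worth checking before running the grouped seq() approaches: they appear to rely on each group having exactly as many rows as there are dates between Flight.Start and Flight.End. That should hold when Days.in.Flight equals the inclusive date span and no two input rows share the same start/end pair; otherwise the assignment may fail or recycle, depending on the package version. A minimal base R sketch of such a check (span, bad_count and dup_pair are hypothetical names):

```r
mydf <- data.frame(Flight.Start = as.Date(c("2015-09-01", "2015-09-10")),
                   Flight.End = as.Date(c("2015-09-03", "2015-09-15")),
                   Device = "Standard",
                   Creative = "Video",
                   Days.in.Flight = c(3, 6),
                   stringsAsFactors = FALSE)

# Inclusive number of days implied by the dates
span <- as.integer(mydf$Flight.End - mydf$Flight.Start) + 1L

# Row indices where Days.in.Flight disagrees with the date span
bad_count <- which(mydf$Days.in.Flight != span)

# Index of the first repeated (Flight.Start, Flight.End) pair, or 0 if none
dup_pair <- anyDuplicated(mydf[c("Flight.Start", "Flight.End")])

bad_count
dup_pair
```

If either check flags a problem, grouping by a per-row identifier instead of the date pair, or computing the counts from the dates, would be safer choices.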

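Hey @Jay, definitely not, thanks. I probably shouldn't have included everything about replicating rows, since I know how to use expandRows(); this question is more about how to fill in a sequential date column for the expanded rows.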