R - ddply: dates are converted to numbers when summarising
I have a table of client ids (id_cliente) with different dates. I need to create a table that contains, for each client, every date between that client's minimum and maximum date. For example, my table is:
tbl<-data.frame(id_cliente=c(1,1,1,1,2,3,3,3),
fecha=c('2013-01-01', '2013-06-01','2013-05-01', '2013-04-01',
'2013-01-01', '2013-01-01','2013-05-01','2013-04-01'))
tbl$fecha<-as.Date(as.character(tbl$fecha))
I thought I could use ddply (plyr package), so I wrote a function that returns the sequence of months:
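meses <- function(xMin, xMax) {
  # monthly sequence from xMin to xMax (stray empty argument in as.Date removed)
  seq(from = as.Date(xMin, format = '%Y-%m-%d'),
      to = as.Date(xMax, format = '%Y-%m-%d'), by = 'month')
}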
The dates stored in the list are converted to numbers.
I know I can convert the numbers back to dates, so:
convFecha<-function(x){as.Date(x, origin='1970-01-01')}
And I get the result I want:
[[1]]
[1] "2013-01-01" "2013-02-01" "2013-03-01" "2013-04-01" "2013-05-01" "2013-06-01"
[[2]]
[1] "2013-01-01"
[[3]]
[1] "2013-01-01" "2013-02-01" "2013-03-01" "2013-04-01" "2013-05-01"
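The numbers show up because R stores a Date as the count of days since 1970-01-01, and some operations on lists drop the Date class. The sketch below illustrates this in base R; the variable names (`fechas`, `como_numero`, `de_vuelta`, `como_fecha`) are made up for the illustration and do not come from the question.

```r
# A Date is stored internally as the number of days since 1970-01-01.
d <- as.Date("2013-01-01")
n <- unclass(d)   # 15706

# unlist() drops the Date class and returns the bare numbers.
fechas <- list(as.Date("2013-01-01"), as.Date("2013-02-01"))
como_numero <- unlist(fechas)

# Converting back requires the origin ...
de_vuelta <- as.Date(como_numero, origin = "1970-01-01")

# ... whereas combining the list elements with c() keeps the class.
como_fecha <- do.call(c, fechas)
```

Combining with `do.call(c, ...)` avoids the round trip through numbers, which is why a helper like convFecha should only be needed after the class has already been lost.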
At this point I don't know how to build the final table. If I try to attach this result to my table, the dates are converted to numbers again:
vf$sec1<-lapply(vf$sec, convFecha)
vf$sec1
This is not a complete answer, but a first step using the by function:
out <- by(tbl, list(tbl$id_cliente),
          # monthly sequence from each client's earliest to latest date
          function(x) seq(from = as.Date(min(x$fecha), format = '%Y-%m-%d'),
                          to = as.Date(max(x$fecha), format = '%Y-%m-%d'), by = 'month'))
> out
: 1
[1] "2013-01-01" "2013-02-01" "2013-03-01" "2013-04-01" "2013-05-01"
[6] "2013-06-01"
-------------------------------------------------------
: 2
[1] "2013-01-01"
-------------------------------------------------------
: 3
[1] "2013-01-01" "2013-02-01" "2013-03-01" "2013-04-01" "2013-05-01"
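One possible way to go from per-client sequences to a single table, sketched in base R only. It swaps `by` for `split` plus `lapply` so that the pieces are plain data frames; `piezas` and `resultado` are hypothetical names chosen for this example.

```r
tbl <- data.frame(id_cliente = c(1, 1, 1, 1, 2, 3, 3, 3),
                  fecha = c('2013-01-01', '2013-06-01', '2013-05-01', '2013-04-01',
                            '2013-01-01', '2013-01-01', '2013-05-01', '2013-04-01'))
tbl$fecha <- as.Date(as.character(tbl$fecha))

# One small data frame per client: the id repeated next to its monthly sequence.
piezas <- lapply(split(tbl, tbl$id_cliente), function(x) {
  data.frame(id_cliente = x$id_cliente[1],
             fecha = seq(from = min(x$fecha), to = max(x$fecha), by = 'month'))
})

# rbind() on data frames keeps the Date class of the fecha column.
resultado <- do.call(rbind, piezas)
rownames(resultado) <- NULL
```

Because each piece is already a data frame, the dates are never stored in a list column and so are not turned into numbers.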
Here is my attempt:
tbl <- data.frame(id_cliente = c(1, 1, 1, 1, 2, 3, 3, 3),
fecha = c('2013-01-01', '2013-06-01', '2013-05-01', '2013-04-01',
'2013-01-01', '2013-01-01', '2013-05-01', '2013-04-01'))
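library(plyr)  # provides ddply() and .()
ddply(tbl, .(id_cliente), function(d) {
  # earliest and latest date for this client
  xMin <- min(as.Date(d$fecha))
  xMax <- max(as.Date(d$fecha))
  # one row per month, formatted as day/month/year text
  data.frame(fecha = format(seq(from = xMin, to = xMax, by = 'month'), format = "%d/%m/%Y"))
})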
   id_cliente      fecha
1           1 01/01/2013
2           1 01/02/2013
3           1 01/03/2013
4           1 01/04/2013
5           1 01/05/2013
6           1 01/06/2013
7           2 01/01/2013
8           3 01/01/2013
9           3 01/02/2013
10          3 01/03/2013
11          3 01/04/2013
12          3 01/05/2013
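Note that format() returns text, so the fecha column above is no longer a Date. If date arithmetic is needed later, the strings can be parsed back with the same format string. A minimal sketch, assuming the column holds day/month/year text as printed above; `fecha_txt` is a stand-in for that column.

```r
# Stand-in for a few values of the formatted fecha column.
fecha_txt <- c("01/01/2013", "01/02/2013", "01/03/2013")

# Parse with the same pattern that format() used to produce the text.
fecha <- as.Date(fecha_txt, format = "%d/%m/%Y")

# Date arithmetic works again, e.g. the gaps in days between consecutive rows.
saltos <- as.numeric(diff(fecha))
```

If the column turns out to be a factor (the default for data.frame() in older R versions), wrapping it in as.character() first is a safe precaution.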