Data processing and algorithms in R
I am fairly new to R and am having some trouble calculating and comparing dates. Basically, I have two data frames, df1 and df2, shown below.
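Could you show the expected output? It is best to use dput to show the data, for example dput(head(data, 10)).

By the way, the date 2013-06-02 18:00:00 does not appear in df1$DateTime.

Does it have to be grouped by Service?

@akrun, df1$DateTime runs until 2014-06-01 00:00:00, which is why I used 2013-06-02 18:00:00 as an example. I am not sure how to show the expected output. For the data frames I posted, the result would be trivial: the cumulative count is 0 for rows 1 to 5 of df2 and 1 for rows 6 to 10. Ideally I do want it grouped by Service, but since I am having trouble with the whole process, I thought it would be simpler to understand it as a whole first and then try to group it by Service with ddply. I am not sure whether that makes sense.

Thank you very much for your help, @akrun. I have one problem: your code restarts every day, so the cumulative sum starts from zero each day. Is there a way to let the cumulative sum carry over to the next day? And yes, you were right earlier, I do want to do this by Service. I will try to adapt your code, although it looks difficult...

@user3770767, as I mentioned earlier, please show the correct expected output.

I am sorry for not replying sooner, but I think I got the output I wanted using adply and subset. As I am short of time, I will provide an example data frame and output in case someone runs into a similar problem.

I have completely revised my question and added the data using dput. The question can be found here: [link missing]. I do realize that I am not very good at formulating questions, and I am not at all confident writing code. If you have time to take a look, I would be very grateful. Thanks

@NarT When I have time, I will take a look at your dataset.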
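The dput suggestion in the comments can be sketched as follows. This is only an illustration: the object named data is a hypothetical stand-in (just the column name DateTime is taken from the post), and the UTC time zone is an assumption.

```r
# Hypothetical stand-in for the real data; only the column name matches the post.
# tz = "UTC" is an assumption, the post does not state a time zone.
data <- data.frame(
  DateTime = format(seq(as.POSIXct("2013-06-01 00:00:00", tz = "UTC"),
                        by = "3 hours", length.out = 12),
                    format = "%Y-%m-%d %H:%M:%S"),
  stringsAsFactors = FALSE
)

# Print a copy-pasteable R expression that recreates the first 10 rows.
dput(head(data, 10))

# Round trip: capture the dput text and rebuild the object from it,
# which is what a reader does when pasting the output into a session.
txt <- paste(capture.output(dput(head(data, 10))), collapse = "\n")
rebuilt <- eval(parse(text = txt))
identical(rebuilt$DateTime, head(data, 10)$DateTime)
```

Pasting the printed structure(...) expression into a question lets others recreate the data with the same column types, which a printed table does not guarantee.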
df1:
              DateTime
1  2013-06-01 00:00:00
2  2013-06-01 03:00:00
3  2013-06-01 06:00:00
4  2013-06-01 09:00:00
5  2013-06-01 12:00:00
6  2013-06-01 15:00:00
7  2013-06-01 18:00:00
8  2013-06-01 21:00:00
9  2013-06-02 00:00:00
10 2013-06-02 03:00:00
df2:
      Create.Date.Time       Service   Closing.Date.Time
1  2013-06-01 12:59:00            AV 2013-06-01 13:59:00
2  2013-06-02 07:56:00 SERVICE684793 2013-06-02 08:59:00
3  2013-06-02 09:39:00 SERVICE684793 2013-06-03 12:01:00
4  2013-06-02 14:14:00 SERVICE684796 2013-06-02 14:55:00
5  2013-06-02 17:20:00 SERVICE684797 2013-06-03 12:06:00
6  2013-06-03 07:20:00 SERVICE684793 2013-06-03 07:39:00
7  2013-06-03 08:02:00 SERVICE684839 2013-06-03 12:09:00
8  2013-06-03 08:04:00 SERVICE684841 2013-06-04 08:05:00
9  2013-06-03 08:04:00 SERVICE684841 2013-06-05 08:06:00
10 2013-06-03 08:08:00 SERVICE684841 2013-06-03 08:08:00
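Try:

library(lubridate)

# Split each timestamp into a calendar date and seconds since midnight.
df1New <- within(df1, {
  Createtime <- period_to_seconds(hms(strftime(DateTime, "%H:%M:%S")))
  Date <- as.Date(DateTime)
})
df2New <- within(df2, {
  Createtime1 <- period_to_seconds(hms(strftime(Create.Date.Time, "%H:%M:%S")))
  Date <- as.Date(Create.Date.Time)
})

# For each date in df1, count the df2 rows created on that same date at or
# before each df1 time of day. The count restarts at zero every day.
df1New$Num.Closed <- unsplit(lapply(split(df1New, df1New$Date), function(x) {
  x2 <- df2New[df2New$Date %in% x$Date, ]
  unlist(lapply(seq_len(nrow(x)), function(i) {
    x1 <- x[i, ]
    sum(x2$Createtime1 <= x1$Createtime)
  }))
}), df1New$Date)

# Drop the helper columns (Date, Createtime) and show the result.
df1New[, -(2:3)]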
#              DateTime Num.Closed
#1  2013-06-01 00:00:00          0
#2  2013-06-01 03:00:00          0
#3  2013-06-01 06:00:00          0
#4  2013-06-01 09:00:00          0
#5  2013-06-01 12:00:00          0
#6  2013-06-01 15:00:00          1
#7  2013-06-01 18:00:00          1
#8  2013-06-01 21:00:00          1
#9  2013-06-02 00:00:00          0
#10 2013-06-02 03:00:00          0
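One of the comments asks whether the count can carry over to the next day instead of restarting at zero. A minimal sketch of one way to do that in base R follows. It assumes the goal is "number of df2 rows created at or before each df1 timestamp". The column name Num.Created and the UTC time zone are assumptions, not part of the original post, and this is not the code the asker eventually used (they mention adply and subset).

```r
# df1 timestamps as posted: every 3 hours from 2013-06-01 00:00:00 (10 rows).
# tz = "UTC" is an assumption; the post does not state a time zone.
df1 <- data.frame(
  DateTime = seq(as.POSIXct("2013-06-01 00:00:00", tz = "UTC"),
                 by = "3 hours", length.out = 10)
)

# Creation times of the 10 rows in df2, as posted.
created <- as.POSIXct(c(
  "2013-06-01 12:59:00", "2013-06-02 07:56:00", "2013-06-02 09:39:00",
  "2013-06-02 14:14:00", "2013-06-02 17:20:00", "2013-06-03 07:20:00",
  "2013-06-03 08:02:00", "2013-06-03 08:04:00", "2013-06-03 08:04:00",
  "2013-06-03 08:08:00"), tz = "UTC")

# findInterval() needs a sorted vector. For each DateTime it returns how many
# sorted creation times are <= that DateTime, so nothing resets at midnight.
df1$Num.Created <- findInterval(as.numeric(df1$DateTime),
                                sort(as.numeric(created)))
df1
```

Comparing full timestamps rather than the time of day is what removes the daily reset. On the posted data the count is 0 for rows 1 to 5 and 1 for rows 6 to 10, which matches the result described in the comments.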
df1 <- structure(list(DateTime = c("2013-06-01 00:00:00", "2013-06-01 03:00:00",
"2013-06-01 06:00:00", "2013-06-01 09:00:00", "2013-06-01 12:00:00",
"2013-06-01 15:00:00", "2013-06-01 18:00:00", "2013-06-01 21:00:00",
"2013-06-02 00:00:00", "2013-06-02 03:00:00")), .Names = "DateTime", class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
df2 <- structure(list(Create.Date.Time = c("2013-06-01 12:59:00", "2013-06-02 07:56:00",
"2013-06-02 09:39:00", "2013-06-02 14:14:00", "2013-06-02 17:20:00",
"2013-06-03 07:20:00", "2013-06-03 08:02:00", "2013-06-03 08:04:00",
"2013-06-03 08:04:00", "2013-06-03 08:08:00"), Service = c("AV",
"SERVICE684793", "SERVICE684793", "SERVICE684796", "SERVICE684797",
"SERVICE684793", "SERVICE684839", "SERVICE684841", "SERVICE684841",
"SERVICE684841"), Closing.Date.Time = c("2013-06-01 13:59:00",
"2013-06-02 08:59:00", "2013-06-03 12:01:00", "2013-06-02 14:55:00",
"2013-06-03 12:06:00", "2013-06-03 07:39:00", "2013-06-03 12:09:00",
"2013-06-04 08:05:00", "2013-06-05 08:06:00", "2013-06-03 08:08:00"
)), .Names = c("Create.Date.Time", "Service", "Closing.Date.Time"
), class = "data.frame", row.names = c("1", "2", "3", "4", "5",
"6", "7", "8", "9", "10"))
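The comments also ask for the count per Service. The sketch below shows one possible approach in base R rather than the ddply route mentioned in the thread. It assumes the quantity wanted is a running count of rows created per Service. The 3-hourly grid is extended to 24 rows (an assumption, since the posted df1 stops at 2013-06-02 03:00:00), and the names times, counts and result are hypothetical.

```r
# Creation times and services of the 10 rows in df2, as posted.
# tz = "UTC" is an assumption; the post does not state a time zone.
created <- as.POSIXct(c(
  "2013-06-01 12:59:00", "2013-06-02 07:56:00", "2013-06-02 09:39:00",
  "2013-06-02 14:14:00", "2013-06-02 17:20:00", "2013-06-03 07:20:00",
  "2013-06-03 08:02:00", "2013-06-03 08:04:00", "2013-06-03 08:04:00",
  "2013-06-03 08:08:00"), tz = "UTC")
service <- c("AV", "SERVICE684793", "SERVICE684793", "SERVICE684796",
             "SERVICE684797", "SERVICE684793", "SERVICE684839",
             "SERVICE684841", "SERVICE684841", "SERVICE684841")

# Hypothetical evaluation grid: every 3 hours, extended to 2013-06-03 21:00:00.
times <- seq(as.POSIXct("2013-06-01 00:00:00", tz = "UTC"),
             by = "3 hours", length.out = 24)

# One column per Service: for each grid time, the number of rows of that
# Service created at or before it (running count, no daily reset).
counts <- sapply(split(as.numeric(created), service),
                 function(ct) findInterval(as.numeric(times), sort(ct)))

result <- data.frame(DateTime = times, counts, check.names = FALSE)
tail(result, 3)
```

Summing across the Service columns gives the ungrouped running count, which is a quick consistency check when adapting this to the full data.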