Calculating date differences in R


I have oddly formatted date and time data and need to calculate the differences in R. Any help is greatly appreciated. Thanks.

TimeStart           TimeEnd
May  1 2016  1:00AM May  1 2016  1:28AM
May  1 2016  1:01AM May  1 2016  1:21AM
May  1 2016  1:00PM May  1 2016  1:13PM
May  1 2016  1:00PM May  4 2016  5:42PM
May  1 2016  1:02PM May  1 2016  1:37PM
May  1 2016  1:02PM May  1 2016  1:14PM
May  1 2016  1:02PM May  1 2016  1:39PM
May  1 2016  1:02PM May  1 2016  1:18PM 

See ?strptime to learn how to specify formats for date/time objects.
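As a minimal sketch of what the format codes in ?strptime do, the snippet below parses a single timestamp from the question. The `tz = "UTC"` argument is an assumption added here so the result does not depend on the local time zone, and the month name `May` is assumed to match the current locale (for example an English one).

```r
# One timestamp in the question's layout
x <- "May  1 2016  1:00PM"

# %b = abbreviated month name, %d = day of month, %Y = four-digit year,
# %I = hour on the 12-hour clock, %M = minute, %p = AM/PM indicator.
# Whitespace in the format matches any run of whitespace in the input.
fmt <- "%b %d %Y %I:%M%p"

# tz = "UTC" is an assumption for reproducibility, not part of the question
parsed <- as.POSIXct(x, format = fmt, tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M")
# "2016-05-01 13:00"

# Pitfall: with %H (24-hour clock) the AM/PM marker is ignored on input,
# so 1:00PM is read as 01:00 instead of 13:00
wrong <- as.POSIXct(x, format = "%b %d %Y %H:%M%p", tz = "UTC")
format(wrong, "%Y-%m-%d %H:%M")
# "2016-05-01 01:00"
```

Differences between two times on the same half of the day come out the same either way, which is why a `%H` mistake can go unnoticed when only the elapsed time is checked.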

library(data.table)
dat <- read.table(text = "May  1 2016  1:00AM May  1 2016  1:28AM
                   May  1 2016  1:01AM May  1 2016  1:21AM
                   May  1 2016  1:00PM May  1 2016  1:13PM
                   May  1 2016  1:00PM May  4 2016  5:42PM
                   May  1 2016  1:02PM May  1 2016  1:37PM
                   May  1 2016  1:02PM May  1 2016  1:14PM
                   May  1 2016  1:02PM May  1 2016  1:39PM
                   May  1 2016  1:02PM May  1 2016  1:18PM")

dat2 <- setDT(dat)[ , list(start = paste(V1, V2, V3, V4),
                           end = paste(V5, V6, V7, V8))]
# %I (12-hour clock) is required for %p to take effect; with %H the AM/PM marker is ignored
dat2[] <- lapply(dat2, as.POSIXct, format = "%B %d %Y %I:%M%p")
dat2[ , diff := end - start]
dat2
#                  start                 end      diff
# 1: 2016-05-01 01:00:00 2016-05-01 01:28:00   28 mins
# 2: 2016-05-01 01:01:00 2016-05-01 01:21:00   20 mins
# 3: 2016-05-01 13:00:00 2016-05-01 13:13:00   13 mins
# 4: 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
# 5: 2016-05-01 13:02:00 2016-05-01 13:37:00   35 mins
# 6: 2016-05-01 13:02:00 2016-05-01 13:14:00   12 mins
# 7: 2016-05-01 13:02:00 2016-05-01 13:39:00   37 mins
# 8: 2016-05-01 13:02:00 2016-05-01 13:18:00   16 mins
In dplyr:

library(dplyr)

df %>%
    # parse datetimes
    mutate_all(as.POSIXct, format = '%b %d %Y %I:%M%p') %>%
    # add column with time difference
    mutate(elapsed = TimeEnd - TimeStart)

##              TimeStart             TimeEnd   elapsed
## 1 2016-05-01 01:00:00 2016-05-01 01:28:00   28 mins
## 2 2016-05-01 01:01:00 2016-05-01 01:21:00   20 mins
## 3 2016-05-01 13:00:00 2016-05-01 13:13:00   13 mins
## 4 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
## 5 2016-05-01 13:02:00 2016-05-01 13:37:00   35 mins
## 6 2016-05-01 13:02:00 2016-05-01 13:14:00   12 mins
## 7 2016-05-01 13:02:00 2016-05-01 13:39:00   37 mins
## 8 2016-05-01 13:02:00 2016-05-01 13:18:00   16 mins
Or the equivalent in base R:

df$TimeStart <- as.POSIXct(df$TimeStart, format = '%b %d %Y %I:%M%p')
df$TimeEnd <- as.POSIXct(df$TimeEnd, format = '%b %d %Y %I:%M%p')
df$elapsed <- df$TimeEnd - df$TimeStart

df
##              TimeStart             TimeEnd   elapsed
## 1 2016-05-01 01:00:00 2016-05-01 01:28:00   28 mins
## 2 2016-05-01 01:01:00 2016-05-01 01:21:00   20 mins
## 3 2016-05-01 13:00:00 2016-05-01 13:13:00   13 mins
## 4 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
## 5 2016-05-01 13:02:00 2016-05-01 13:37:00   35 mins
## 6 2016-05-01 13:02:00 2016-05-01 13:14:00   12 mins
## 7 2016-05-01 13:02:00 2016-05-01 13:39:00   37 mins
## 8 2016-05-01 13:02:00 2016-05-01 13:18:00   16 mins
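Subtracting two date-times lets R pick the unit automatically (the largest unit in which every difference exceeds one), so the unit can change with the data. The sketch below asks for minutes explicitly with `difftime`. The two-row `df` is a hypothetical reconstruction of the question's data, and `tz = "UTC"` is an assumption made for reproducibility.

```r
# Hypothetical two-row subset of the question's data as character columns
df <- data.frame(
  TimeStart = c("May  1 2016  1:00AM", "May  1 2016  1:00PM"),
  TimeEnd   = c("May  1 2016  1:28AM", "May  4 2016  5:42PM"),
  stringsAsFactors = FALSE
)

fmt <- "%b %d %Y %I:%M%p"
start <- as.POSIXct(df$TimeStart, format = fmt, tz = "UTC")
end   <- as.POSIXct(df$TimeEnd,   format = fmt, tz = "UTC")

# Request minutes explicitly instead of relying on the automatic unit
elapsed <- difftime(end, start, units = "mins")

# Strip the difftime class to get plain numbers for further arithmetic
elapsed_min <- as.numeric(elapsed)
elapsed_min
# 28 4602
```

Fixing the unit this way keeps downstream code (sums, averages, thresholds) from silently switching between seconds, minutes, and hours.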

I prefer lubridate for things like this. It is a simple package that parses date-times using a consistent naming scheme.

library(lubridate)
First, parse both columns with mdy_hm:

df2 <- apply(df, 2, mdy_hm)

apply returns a numeric matrix (seconds since the epoch), so subtract the columns and convert the result to a duration with dseconds:

dseconds(df2[,2]-df2[,1])

The result looks like this:

[1] "1680s (~28 minutes)"     "1200s (~20 minutes)"
[3] "780s (~13 minutes)"      "276120s (~4602 minutes)"
[5] "2100s (~35 minutes)"     "720s (~12 minutes)"
[7] "2220s (~37 minutes)"     "960s (~16 minutes)"

Can you add some clarification about your data? Could you run dput so we can see how the columns are defined?
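For readers unfamiliar with `dput`, here is a small sketch of what the request above is asking for. The two-row `df` is a hypothetical stand-in for the asker's data, not the actual object.

```r
# Hypothetical stand-in for the asker's data frame
df <- data.frame(
  TimeStart = c("May  1 2016  1:00AM", "May  1 2016  1:01AM"),
  TimeEnd   = c("May  1 2016  1:28AM", "May  1 2016  1:21AM"),
  stringsAsFactors = FALSE
)

# dput prints R code that recreates the object, including column types,
# so others can see whether the columns are character, factor, or POSIXct
dput(df)

# A quicker check of the column classes alone
sapply(df, class)

# Round trip: evaluating the dput text rebuilds the same object
dput_text <- paste(capture.output(dput(df)), collapse = "\n")
rebuilt <- eval(parse(text = dput_text))
identical(rebuilt, df)
```

Pasting the `dput` output into a question lets answerers reproduce the exact column definitions instead of guessing from a printed table.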