Calculating date differences in R
I have oddly formatted date and time data and need to calculate the differences in R. Any help would be greatly appreciated. Thanks.
TimeStart TimeEnd
May 1 2016 1:00AM May 1 2016 1:28AM
May 1 2016 1:01AM May 1 2016 1:21AM
May 1 2016 1:00PM May 1 2016 1:13PM
May 1 2016 1:00PM May 4 2016 5:42PM
May 1 2016 1:02PM May 1 2016 1:37PM
May 1 2016 1:02PM May 1 2016 1:14PM
May 1 2016 1:02PM May 1 2016 1:39PM
May 1 2016 1:02PM May 1 2016 1:18PM
See ?strptime for how to format date/time objects.
library(data.table)
dat <- read.table(text = "May 1 2016 1:00AM May 1 2016 1:28AM
May 1 2016 1:01AM May 1 2016 1:21AM
May 1 2016 1:00PM May 1 2016 1:13PM
May 1 2016 1:00PM May 4 2016 5:42PM
May 1 2016 1:02PM May 1 2016 1:37PM
May 1 2016 1:02PM May 1 2016 1:14PM
May 1 2016 1:02PM May 1 2016 1:39PM
May 1 2016 1:02PM May 1 2016 1:18PM")
dat2 <- setDT(dat)[ , list(start = paste(V1, V2, V3, V4),
end = paste(V5, V6, V7, V8))]
# %I (12-hour clock) is required for %p (AM/PM) to take effect; with %H the PM rows parse as 01:xx
dat2[] <- lapply(dat2, as.POSIXct, format = "%B %d %Y %I:%M%p")
dat2[ , diff := end - start]
dat2
#                  start                 end      diff
# 1: 2016-05-01 01:00:00 2016-05-01 01:28:00   28 mins
# 2: 2016-05-01 01:01:00 2016-05-01 01:21:00   20 mins
# 3: 2016-05-01 13:00:00 2016-05-01 13:13:00   13 mins
# 4: 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
# 5: 2016-05-01 13:02:00 2016-05-01 13:37:00   35 mins
# 6: 2016-05-01 13:02:00 2016-05-01 13:14:00   12 mins
# 7: 2016-05-01 13:02:00 2016-05-01 13:39:00   37 mins
# 8: 2016-05-01 13:02:00 2016-05-01 13:18:00   16 mins
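One pitfall with 12-hour timestamps like these: per ?strptime, %p is meant to be used together with %I, not %H. The sketch below parses a single PM value from the sample data both ways. The variable names and tz = "UTC" are choices made here for illustration, and it assumes an English locale so that "May" and "PM" are recognized.

```r
# One PM timestamp taken from the sample data
x <- "May 1 2016 1:00PM"

# %I = hour on a 12-hour clock, so the trailing PM is applied
with_I <- as.POSIXct(x, format = "%b %d %Y %I:%M%p", tz = "UTC")

# %H = hour on a 24-hour clock; the AM/PM marker does not shift the hour
with_H <- as.POSIXct(x, format = "%b %d %Y %H:%M%p", tz = "UTC")

# Compare the parsed hours
format(with_I, "%H:%M")
format(with_H, "%H:%M")
```

Because both columns in each row share the same AM/PM marker, the differences happen to come out the same either way, but the stored times are only right with %I.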
In dplyr:
library(dplyr)
# parse datetimes
df %>% mutate_all(as.POSIXct, format = '%b %d %Y %I:%M%p') %>%
# add column with time difference
mutate(elapsed = TimeEnd - TimeStart)
## TimeStart TimeEnd elapsed
## 1 2016-05-01 01:00:00 2016-05-01 01:28:00 28 mins
## 2 2016-05-01 01:01:00 2016-05-01 01:21:00 20 mins
## 3 2016-05-01 13:00:00 2016-05-01 13:13:00 13 mins
## 4 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
## 5 2016-05-01 13:02:00 2016-05-01 13:37:00 35 mins
## 6 2016-05-01 13:02:00 2016-05-01 13:14:00 12 mins
## 7 2016-05-01 13:02:00 2016-05-01 13:39:00 37 mins
## 8 2016-05-01 13:02:00 2016-05-01 13:18:00 16 mins
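Subtracting two POSIXct vectors lets R pick the unit automatically (minutes in the output above). If a fixed unit is needed, difftime() takes a units argument. A minimal sketch on two rows of the sample data; the names start, end, elapsed_hours and elapsed_mins are illustrative and tz = "UTC" is an assumption.

```r
# Two start/end pairs from the sample data (rows 1 and 4)
fmt   <- "%b %d %Y %I:%M%p"
start <- as.POSIXct(c("May 1 2016 1:00AM", "May 1 2016 1:00PM"), format = fmt, tz = "UTC")
end   <- as.POSIXct(c("May 1 2016 1:28AM", "May 4 2016 5:42PM"), format = fmt, tz = "UTC")

# Ask for a specific unit instead of relying on the automatic choice
elapsed_hours <- difftime(end, start, units = "hours")

# Convert to a plain number when the difftime class gets in the way
elapsed_mins <- as.numeric(difftime(end, start, units = "mins"))

elapsed_hours
elapsed_mins
```

Converting to numeric is handy before writing to CSV or summing, since difftime columns carry their unit as an attribute.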
Or the equivalent in base R:
df$TimeStart <- as.POSIXct(df$TimeStart, format = '%b %d %Y %I:%M%p')
df$TimeEnd <- as.POSIXct(df$TimeEnd, format = '%b %d %Y %I:%M%p')
df$elapsed <- df$TimeEnd - df$TimeStart
df
## TimeStart TimeEnd elapsed
## 1 2016-05-01 01:00:00 2016-05-01 01:28:00 28 mins
## 2 2016-05-01 01:01:00 2016-05-01 01:21:00 20 mins
## 3 2016-05-01 13:00:00 2016-05-01 13:13:00 13 mins
## 4 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
## 5 2016-05-01 13:02:00 2016-05-01 13:37:00 35 mins
## 6 2016-05-01 13:02:00 2016-05-01 13:14:00 12 mins
## 7 2016-05-01 13:02:00 2016-05-01 13:39:00 37 mins
## 8 2016-05-01 13:02:00 2016-05-01 13:18:00 16 mins
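The dplyr and base R snippets both assume a data frame df whose TimeStart and TimeEnd columns hold the timestamps as text; the post does not show how it was built. One possible reconstruction from the sample data (the construction itself is an assumption, not the asker's code):

```r
# Hypothetical reconstruction of `df` from the sample rows in the question
df <- data.frame(
  TimeStart = c("May 1 2016 1:00AM", "May 1 2016 1:01AM",
                "May 1 2016 1:00PM", "May 1 2016 1:00PM",
                "May 1 2016 1:02PM", "May 1 2016 1:02PM",
                "May 1 2016 1:02PM", "May 1 2016 1:02PM"),
  TimeEnd   = c("May 1 2016 1:28AM", "May 1 2016 1:21AM",
                "May 1 2016 1:13PM", "May 4 2016 5:42PM",
                "May 1 2016 1:37PM", "May 1 2016 1:14PM",
                "May 1 2016 1:39PM", "May 1 2016 1:18PM"),
  stringsAsFactors = FALSE  # keep the timestamps as character, not factor
)

# Inspect the column types before parsing
str(df)
```

Running dput(df) on the real data would show whether the columns are character, factor, or already date-time.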
I prefer using lubridate for this kind of thing. It is a simple package that parses date-times using a consistent naming scheme.
library(lubridate)
First, parse the columns with mdy_hm:
df2 <- apply(df, 2, mdy_hm)
dseconds(df2[,2]-df2[,1])
The result looks like this:
[1] "1680s (~28 minutes)" "1200s (~20 minutes)"
[3] "780s (~13 minutes)" "276120s (~4602 minutes)"
[5] "2100s (~35 minutes)" "720s (~12 minutes)"
[7] "2220s (~37 minutes)" "960s (~16 minutes)"
Can you add some clarification about your data? Could you run dput so we can see how the columns are defined?
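A note on the apply() step: apply() returns a plain numeric matrix, so the date-time class is dropped and the column difference comes out as bare seconds, which is presumably why it is then wrapped in dseconds. The sketch below shows the same effect in base R only. parse_time is a hypothetical helper standing in for mdy_hm, and tz = "UTC" is an assumption.

```r
# Two rows of the sample data, kept as character
df <- data.frame(
  TimeStart = c("May 1 2016 1:00AM", "May 1 2016 1:01AM"),
  TimeEnd   = c("May 1 2016 1:28AM", "May 1 2016 1:21AM"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for lubridate::mdy_hm, using base R parsing
parse_time <- function(x) as.POSIXct(x, format = "%b %d %Y %I:%M%p", tz = "UTC")

# apply() collects the results into a numeric matrix (seconds since the epoch)
m <- apply(df, 2, parse_time)
secs <- m[, 2] - m[, 1]   # plain numbers, in seconds

# lapply() keeps each column as POSIXct, so subtraction yields a difftime
cols <- lapply(df, parse_time)
elapsed <- cols$TimeEnd - cols$TimeStart

secs
elapsed
```

Assigning the lapply() result back with df[] <- lapply(df, parse_time) keeps the data frame shape and the date-time class at the same time.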