Organizing data using max and min values in R
I have a table like this, generated by the following code:
id <- c("1","2","1","2","1","1")
status <- c("open","open","closed","closed","open","closed")
date <- c("11-10-2017 15:10","10-10-2017 12:10","12-10-2017 22:10","13-10-2017 06:30","13-10-2017 09:30","13-10-2017 10:30")
data <- data.frame(id,status,date)
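# Split each "dd-mm-yyyy HH:MM" string on the space into a date part and a time part
hour <- data.frame(do.call('rbind', strsplit(as.character(data$date),' ',fixed=TRUE)))
# Keep only the time part (second column)
hour <- hour[,2]
# Parse the time; no date is given, so the current date is filled in
hour <- as.POSIXlt(hour, format = "%H:%M")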
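Question 1: Is there a simpler way to do this?
Question 2: If I alias max(hour) in the query to any name other than hour, the result is no longer in date-time format but a series of numbers such as 1507864200, 1507800. How can I keep the time format when giving the column a different name?
Using the plyr package (for some reason you have to convert hour to class POSIXct with as.POSIXct, as in the code below that adds data$hour; otherwise you get an error message):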
Do you mean hour as a column in data? Perhaps you forgot a data$hour assignment.
sqldf("select * from (select id, status, date as closeDate, max(hour) as hour from data
where status='closed'
group by id,status) as a
join
(select id, status, date as openDate, min(hour) as hour from data
where status='open'
group by id,status) as b
using(id);")
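On Question 2: the long numbers are the seconds since 1970-01-01 that a POSIXct value stores internally. As far as I understand sqldf's default behaviour, it restores a column's class only when the output column name matches a column of the input data frame, which would explain why an alias other than hour comes back as plain numbers. A minimal base R sketch of converting such a number back, assuming UTC as the time zone (the variable names raw_seconds and restored are illustrative):

```r
# Value quoted in the question: seconds since 1970-01-01 00:00:00 UTC
raw_seconds <- 1507864200

# Rebuild a POSIXct from the raw seconds (tz = "UTC" is an assumption;
# use the time zone the data was recorded in)
restored <- as.POSIXct(raw_seconds, origin = "1970-01-01", tz = "UTC")

# Show only the time of day
format(restored, "%H:%M")
```

The same as.POSIXct call can be applied to a whole numeric column of a query result, whatever alias it was given.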
#add hour to data.frame:
data$hour <- as.POSIXct(hour)
library(plyr)
ddply(data, .(id), summarize, open=min(hour[status=="open"]),
closed=max(hour[status=="closed"]))
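For comparison, here is a base R sketch that needs neither sqldf nor plyr. It assumes that the full date-time (not just the hour of day) is what should be compared, parses it in a single as.POSIXct call instead of splitting strings, and uses UTC as the time zone. The names when, ids, pick and result are illustrative and not part of the original code.

```r
id <- c("1","2","1","2","1","1")
status <- c("open","open","closed","closed","open","closed")
date <- c("11-10-2017 15:10","10-10-2017 12:10","12-10-2017 22:10",
          "13-10-2017 06:30","13-10-2017 09:30","13-10-2017 10:30")
data <- data.frame(id, status, date, stringsAsFactors = FALSE)

# Parse "dd-mm-yyyy HH:MM" directly (tz = "UTC" is an assumption)
data$when <- as.POSIXct(data$date, format = "%d-%m-%Y %H:%M", tz = "UTC")

ids <- unique(data$id)

# Apply f (min or max) to the times of one status, per id, as raw seconds
pick <- function(f, st) {
  vapply(ids, function(i) {
    as.numeric(f(data$when[data$id == i & data$status == st]))
  }, numeric(1), USE.NAMES = FALSE)
}

# Convert the raw seconds back to POSIXct so the columns print as date-times
result <- data.frame(
  id = ids,
  open = as.POSIXct(pick(min, "open"), origin = "1970-01-01", tz = "UTC"),
  closed = as.POSIXct(pick(max, "closed"), origin = "1970-01-01", tz = "UTC")
)
print(result)
```

Because the conversion back to POSIXct is explicit, the output columns can be given any name without losing the date-time format.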