How to calculate the duration of consecutive runs of the same value in R


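I want to calculate how long a traffic light stays green, amber and red within each traffic cycle (column `sg.0` in my sample data). For example, for each cycle I want the length of time from the first green state to the last green state. How can I do this? The data frame looks like this: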

                     time  sg.0
1   2014-09-01 00:00:12.0 green
2   2014-09-01 00:00:13.5 green
3   2014-09-01 00:00:30.0 amber
4   2014-09-01 00:00:30.0 amber
5   2014-09-01 00:00:31.5 amber
6   2014-09-01 00:00:32.0 amber
7   2014-09-01 00:00:32.2 amber
8   2014-09-01 00:00:33.5 amber
9   2014-09-01 00:00:33.0   red
10  2014-09-01 00:00:35.0   red
11  2014-09-01 00:00:35.2   red
12  2014-09-01 00:00:37.0   red
13  2014-09-01 00:00:41.0   red
14  2014-09-01 00:00:42.0   red
15  2014-09-01 00:00:42.2   red
16  2014-09-01 00:00:43.0   red
17  2014-09-01 00:00:44.7   red
18  2014-09-01 00:00:44.2   red
19  2014-09-01 00:00:45.5   red
20  2014-09-01 00:00:47.0   red
21  2014-09-01 00:00:48.7   red
22  2014-09-01 00:00:49.7   red
23  2014-09-01 00:00:49.7   red
24  2014-09-01 00:00:49.9   red
25  2014-09-01 00:00:50.9 green
26  2014-09-01 00:00:50.0 green
27  2014-09-01 00:00:52.0 green
28  2014-09-01 00:00:53.0 green
29  2014-09-01 00:00:54.0 green
30  2014-09-01 00:00:55.0 green
31  2014-09-01 00:00:55.0 green
32  2014-09-01 00:01:02.0 green
33  2014-09-01 00:01:03.7 green
34  2014-09-01 00:01:05.7 green
35  2014-09-01 00:01:07.0 green
Raw data:

structure(list(time = structure(c(1409518812, 1409518813.6, 1409518830, 
1409518830.1, 1409518831.6, 1409518832, 1409518832.2, 1409518833.6, 
1409518833, 1409518835, 1409518835.3, 1409518837, 1409518841, 
1409518842, 1409518842.3, 1409518843, 1409518844.8, 1409518844.2, 
1409518845.6, 1409518847, 1409518848.7, 1409518849.7, 1409518849.8, 
1409518849.9, 1409518850.9, 1409518850, 1409518852, 1409518853, 
1409518854, 1409518855, 1409518855.1, 1409518862, 1409518863.8, 
1409518865.8, 1409518867, 1409518868, 1409518870.7, 1409518870.3, 
1409518884, 1409518884.2, 1409518884.3, 1409518884.5, 1409518890, 
1409518942, 1409518942.1, 1409518943.7, 1409518943.3, 1409518944.9, 
1409518944, 1409518945, 1409518947, 1409518949.5, 1409518949.6, 
1409518953, 1409518954, 1409518957.8, 1409518957.2, 1409518961, 
1409518961.1, 1409518961.2, 1409518962.2, 1409518962.3, 1409518964, 
1409518965, 1409518966, 1409518967, 1409518967.1, 1409518974, 
1409518975.8, 1409518977.8, 1409518979, 1409518980, 1409519068, 
1409519068.1, 1409519068.7, 1409519070, 1409519071, 1409519073, 
1409519073.8, 1409519081, 1409519082, 1409519083.3, 1409519083.8, 
1409519084.7, 1409519086, 1409519087.6, 1409519089.2, 1409519089.3, 
1409519091, 1409519091.1, 1409519091.6, 1409519092, 1409519092.1, 
1409519093, 1409519094, 1409519094.5, 1409519095, 1409519095.1, 
1409519103, 1409519104), class = c("POSIXct", "POSIXt")), sg.0 = structure(c(2L, 
2L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 
2L, 2L, 2L), .Label = c("amber", "green", "red"), class = "factor")), .Names = c("time", 
"sg.0"), row.names = c(NA, 100L), class = "data.frame")

You probably want to uniquely identify each colour run first; then you can collect statistics for each group. You can build a run index with

cycle<-cumsum(c(FALSE, dd[-1,2] != dd[-nrow(dd),2]))
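One way to turn such a run index into per-run durations is sketched below. This is a minimal illustration, not necessarily the answerer's exact call: the data frame `dd` is a small made-up example (time in column 1, colour factor in column 2), and `interaction()` is assumed as the way to label each run as colour.index.

```r
# Toy data (hypothetical, not the question's data): time in column 1, colour in column 2
dd <- data.frame(
  time = as.POSIXct("2014-09-01 00:00:00", tz = "UTC") + c(0, 2, 5, 6, 9, 20, 21, 24),
  sg.0 = factor(c("green", "green", "amber", "amber", "red", "red", "green", "green"))
)

# Run index: goes up by one every time the colour differs from the previous row
cycle <- cumsum(c(FALSE, dd[-1, 2] != dd[-nrow(dd), 2]))

# Label each run as <colour>.<run index>, then take latest minus earliest time.
# The units are forced to seconds so that every run is reported on the same scale.
run_secs <- tapply(dd[, 1], interaction(dd[, 2], cycle, drop = TRUE),
                   function(x) as.double(diff(range(x)), units = "secs"))
run_secs
```

Fixing `units` matters because a plain `diff()` on date-times chooses its units automatically, so a run longer than a minute would otherwise be reported in minutes while shorter runs stay in seconds.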

The time range within each run of identical colours comes out as (in seconds):

green.0 amber.1   red.2 green.3 amber.4   red.5 green.6 amber.7   red.8 green.9 
    1.6     3.6    16.9    40.0     2.9    16.2    17.8     2.0    23.5     9.0 

Or, if you mean full green/amber/red cycles, you can do

cycle<-cumsum(c(dd[1,2]!="green", dd[-1,2] == "green" & dd[-nrow(dd),2] !="green"))
tapply(dd[,1], cycle, function(x) as.double(diff(range(x)), units="mins"))

which returns (in minutes)

        0         1         2         3 
0.6316667 1.8533333 2.2050000 0.1500000

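Similar to MrFlick's approach, you can use `rle` to first generate an indicator for each colour run, and then use that indicator to calculate the durations.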

# If you want to calculate the time within each colour
r <- rle(as.numeric(dat$sg.0))
r$values <- seq_along(r$values)
dat$id <- inverse.rle(r)

(a <- aggregate(time ~ sg.0 + id, dat, function(i) diff(as.numeric(range(i)))))
#    sg.0 id time
#1  green  1  1.6
#2  amber  2  3.6
#3    red  3 16.9
# ...

# Use a similar approach if a cycle is one full green/amber/red sequence:
# every three consecutive runs share one cycle number
r <- rle(as.numeric(dat$sg.0))
r$values <- rep(seq_along(r$values), each = 3, length.out = length(r$values))
dat$cycle <- inverse.rle(r)

(b <- aggregate(time ~ cycle, dat, function(i) diff(as.numeric(range(i)))))
#  cycle  time
#1     1  37.9
#2     2 111.2
#3     3 132.3
#4     4   9.0
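To make the `rle` / `inverse.rle` step more concrete, here is a small sketch on a made-up colour vector (`sg` is hypothetical, not the question's data). The integer-division line at the end is an alternative way to group three runs into one cycle, swapped in for the `rep(..., each = 3)` call above. Like that call, it assumes the data starts on green and no phase is ever skipped.

```r
# Toy colour sequence (hypothetical)
sg <- factor(c("green", "green", "amber", "red", "red", "green"))

# Run-length encoding: one entry per run of identical values
r <- rle(as.character(sg))
r$lengths   # how many rows each run covers
r$values    # the colour of each run

# Replace each run's colour by its position, then expand back to one value per row
r$values <- seq_along(r$values)
id <- inverse.rle(r)
id

# Alternative grouping: runs 1-3 form cycle 1, runs 4-6 form cycle 2, and so on
cycle <- (id - 1) %/% 3 + 1
cycle
```

Any per-group summary (for example with `aggregate` or `tapply`) can then be keyed on `id` for single colours or on `cycle` for whole green/amber/red sequences.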
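Thanks for the thoughtful answer. The only problem is with the complete sequence of green/amber/red phases: the units are not consistent. Values under one minute are given in seconds, while values over one minute are converted to minutes.

Thanks, you are quite right, I missed that. I have made a small edit.

Hi @user20650. Sorry to bother you, I just realized that I had misunderstood the problem. The duration should run from the earliest time of the current colour to the earliest time of the next colour. I can't work out how to change your solution; could you give me a hand?

Use the first `rle` approach to compute `id`; you can then use (a) to extract the first value of each light change.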
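A possible way to carry out that follow-up (earliest time of one colour to earliest time of the next) is sketched below. It is an illustration under stated assumptions rather than the answerer's own code: the data frame `dat` is a small made-up example, and `tapply` on numeric epoch seconds is used to get the earliest time per run.

```r
# Toy data (hypothetical): one row per observation
dat <- data.frame(
  time = as.POSIXct("2014-09-01 00:00:00", tz = "UTC") + c(0, 2, 5, 6, 9, 20, 21, 24),
  sg.0 = factor(c("green", "green", "amber", "amber", "red", "red", "green", "green"))
)

# Run id per row, as in the first rle approach
r <- rle(as.character(dat$sg.0))
r$values <- seq_along(r$values)
dat$id <- inverse.rle(r)

# Earliest time of each run in epoch seconds; min() rather than the first row,
# because rows inside a run are not guaranteed to be sorted by time
first_secs <- tapply(as.numeric(dat$time), dat$id, min)

# Colour of each run, taken from the first row of that run
colour <- as.character(dat$sg.0[!duplicated(dat$id)])

# Seconds from the start of each run to the start of the next one;
# the last run has no successor, so its duration is NA
res <- data.frame(
  id   = as.integer(names(first_secs)),
  sg.0 = colour,
  secs = c(diff(as.vector(first_secs)), NA)
)
res
```

Working in plain seconds sidesteps the automatic unit selection of date-time differences, so all durations stay on one scale.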