R: timestamps with values, defining visits and time of visit


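I am currently working with a dataset containing sensor data, and I would like to get some summary statistics from it. More precisely, I want the number of visits and the total time in use. A visit is defined by several 0 values within X minutes after a timestamp with value 1.

My data looks like this: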

SensorId          timestamp          value
1                 10:10:10            1
1                 10:12:10            1
1                 10:14:00            1
1                 10:16:00            0
1                 10:18:00            0
1                 10:20:00            0
2                 13:10:10            1
2                 13:12:10            1
2                 13:14:00            1
2                 13:20:00            1
2                 13:22:00            0

This is the result I want:

SensorId          total time in use          Number of visits
1                 4                             1
2                 10                            1

There are quite a lot of rows, so I would like to get the total time in use and the number of visits per SensorId.

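We can convert timestamp to the POSIXct class, arrange the rows, group them by SensorId and by runs of consecutive identical values, and subtract the first timestamp from the last in each group:
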
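library(dplyr)

df %>%
  # parse the "HH:MM:SS" strings as date-times (the current date is filled in)
  mutate(timestamp = as.POSIXct(timestamp, format = "%T")) %>%
  arrange(SensorId, timestamp) %>%
  # rleid() gives every run of consecutive identical values its own id
  group_by(SensorId, grp = data.table::rleid(value)) %>%
  # duration of each run: last timestamp minus first timestamp
  summarise(total_time = round(last(timestamp) - first(timestamp)),
            number_of_visit = first(value)) %>%
  # keep only the runs in which the sensor reported 1
  filter(number_of_visit == 1) %>%
  select(-grp)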

#  SensorId total_time number_of_visit
#     <int> <drtn>               <int>
#1        1  4 mins                  1
#2        2 10 mins                  1
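
The pipeline above returns one row per run of 1s, so a sensor with several visits would appear on several rows. As a rough sketch (not part of the original answer), the runs could be collapsed to one row per sensor by summing the run durations and counting the runs. The data frame below is rebuilt from the sample in the question; the names per_sensor, run_mins, total_time_mins and number_of_visits are made up for illustration, tz = "UTC" is an added assumption, and the X-minute rule from the question is not applied here.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  SensorId  = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  timestamp = c("10:10:10", "10:12:10", "10:14:00", "10:16:00",
                "10:18:00", "10:20:00", "13:10:10", "13:12:10",
                "13:14:00", "13:20:00", "13:22:00"),
  value     = c(1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0)
)

per_sensor <- df %>%
  # tz = "UTC" is an assumption, used only to avoid daylight-saving surprises
  mutate(timestamp = as.POSIXct(timestamp, format = "%T", tz = "UTC")) %>%
  arrange(SensorId, timestamp) %>%
  # one group per sensor and per run of consecutive identical values
  group_by(SensorId, grp = data.table::rleid(value)) %>%
  summarise(
    value    = first(value),
    # run length in minutes, with explicit units so all runs are comparable
    run_mins = as.numeric(difftime(last(timestamp), first(timestamp),
                                   units = "mins")),
    .groups  = "drop"
  ) %>%
  # keep only the runs of 1s; each such run is treated as one visit
  filter(value == 1) %>%
  group_by(SensorId) %>%
  summarise(
    total_time_mins  = round(sum(run_mins)),
    number_of_visits = n(),
    .groups = "drop"
  )

per_sensor
```

On the sample data this should reproduce the table requested in the question: 4 minutes and 1 visit for sensor 1, 10 minutes and 1 visit for sensor 2. Treating every run of 1s as a separate visit is a simplification; merging runs separated by fewer than X minutes of 0s would need an extra step.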

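Please add one. That way you can help others help you! Also, what does this mean: "A visit is defined by no timestamp, or by several 0 values within X minutes."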