Subset and group specific cases in R
I have a data frame [df] like this:
id device date speed incident
1 B3 2020-04-15 08:00 23 0
2 B3 2020-04-15 09:00 21 0
3 B3 2020-04-15 10:00 54 1
4 B3 2020-04-15 11:00 52 1
5 B3 2020-04-15 12:00 24 0
6 B3 2020-04-15 13:00 12 0
7 B3 2020-04-16 09:00 51 1
8 B3 2020-04-16 10:00 16 0
9 B3 2020-04-16 11:00 20 0
10 B3 2020-04-16 12:00 21 0
11 B3 2020-04-16 13:00 19 0
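I would like to know whether there is a way to subset the data so that only the rows where incident = 1 are kept, together with the row immediately before and the row immediately after each incident, and to assign an id to each incident group.
The preferred result looks like this: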
id device date speed incident group
2 B3 2020-04-15 09:00 21 0 1
3 B3 2020-04-15 10:00 54 1 1
4 B3 2020-04-15 11:00 52 1 1
5 B3 2020-04-15 12:00 24 0 1
6 B3 2020-04-15 13:00 12 0 2
7 B3 2020-04-16 09:00 51 1 2
8 B3 2020-04-16 10:00 16 0 2
Any suggestions would be greatly appreciated.

Here is a base R approach:
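# Get row numbers where incident == 1
ones <- which(df$incident == 1)
# Split them into runs of consecutive row numbers (one run per incident group)
inds <- split(ones, cumsum(c(TRUE, diff(ones) > 1)))
# For each run, take the row before it, the run itself, and the row after it,
# add a group column, and combine everything into one data frame
do.call(rbind, Map(function(x, y) {
  rows <- c(min(x) - 1, x, max(x) + 1)
  # Drop positions outside the data frame (incident in the first or last row)
  rows <- rows[rows >= 1 & rows <= nrow(df)]
  transform(df[rows, ], group = y)
}, inds, names(inds)))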
#    id device             date speed incident group
#1.2  2     B3 2020-04-15 09:00    21        0     1
#1.3  3     B3 2020-04-15 10:00    54        1     1
#1.4  4     B3 2020-04-15 11:00    52        1     1
#1.5  5     B3 2020-04-15 12:00    24        0     1
#2.6  6     B3 2020-04-15 13:00    12        0     2
#2.7  7     B3 2020-04-16 09:00    51        1     2
#2.8  8     B3 2020-04-16 10:00    16        0     2
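
To try the approach end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and wraps the same idea in a function. The function name `incident_windows`, its `flag` and `pad` arguments, and the choice to store `date` as character are assumptions made for illustration; they are not part of the original answer or of base R.

```r
# Sample data reconstructed from the question
# (date is kept as character here, which is an assumption)
df <- data.frame(
  id = 1:11,
  device = "B3",
  date = c("2020-04-15 08:00", "2020-04-15 09:00", "2020-04-15 10:00",
           "2020-04-15 11:00", "2020-04-15 12:00", "2020-04-15 13:00",
           "2020-04-16 09:00", "2020-04-16 10:00", "2020-04-16 11:00",
           "2020-04-16 12:00", "2020-04-16 13:00"),
  speed = c(23, 21, 54, 52, 24, 12, 51, 16, 20, 21, 19),
  incident = c(0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keeps each run of flag == 1 plus `pad` rows on
# either side, and numbers the runs in an integer `group` column.
incident_windows <- function(data, flag = "incident", pad = 1) {
  ones <- which(data[[flag]] == 1)
  if (length(ones) == 0) {
    # No incidents: return an empty data frame with the same columns
    out <- data[0, , drop = FALSE]
    out$group <- integer(0)
    return(out)
  }
  # A new run starts wherever the gap between flagged rows exceeds 1
  runs <- split(ones, cumsum(c(TRUE, diff(ones) > 1)))
  pieces <- Map(function(x, g) {
    rows <- seq(min(x) - pad, max(x) + pad)
    rows <- rows[rows >= 1 & rows <= nrow(data)]  # clip at the edges
    piece <- data[rows, , drop = FALSE]
    piece$group <- g
    piece
  }, runs, seq_along(runs))
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # replace the "1.2"-style row names
  out
}

result <- incident_windows(df)
print(result)
```

If two incident runs are separated by fewer than `2 * pad + 1` rows, their windows overlap and the shared rows appear once in each group. Whether that is desirable depends on the use case.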