R: How to index values in a column within groups defined by another column?

I have a data frame:

dput(df)
structure(list(ID = c("A1", "A1", "A1", "A1", "A1", "A1", "B2",
"B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2", "B2"), operation = c("open",
"open", "close", "", "open", "close", "", "open", "open", "open",
"close", "upload", "open", "close", "open", "close")), class = "data.frame", row.names = c(NA,
-16L))
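I want to add an index for each "open"/"close" packet in the operation column, so that every row from an open through its matching close has the same index. Numbering the packets across the whole data frame gives: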

ID      operation    index
A1       open         1
A1       open         1
A1       close        1
A1       
A1       open         2
A1       close        2
B2      
B2       open         3
B2       open         3
B2       open         3
B2       close        3
B2       upload
B2       open         4
B2       close        4
B2       open         5
B2       close        5
But I want the numbering to restart within each ID, so the desired result is:

ID      operation    index
A1       open         1
A1       open         1
A1       close        1
A1       
A1       open         2
A1       close        2
B2      
B2       open         1
B2       open         1
B2       open         1
B2       close        1
B2       upload
B2       open         2
B2       close        2
B2       open         3
B2       close        3
This is how I currently compute the index:

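# TRUE on every row that closes a packet
rank <- df$operation == "close" & !is.na(df$operation)
# shift by one row so the counter increases on the row after each close
df$index <- cumsum(c(1, rank[-length(rank)]))
# rows that are neither open nor close belong to no packet
df$index[!df$operation %in% c("open", "close")] <- NA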

How can I make the index restart for each ID?

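Here is one approach using data.table. It starts from the index column already computed in the question and renumbers it within each ID: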

library(data.table)

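setDT(df)  # convert the data.frame to a data.table by reference
# renumber runs of the existing index within each ID; NA rows are left untouched
df[!is.na(index), index := rleid(index), by = .(ID)]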
df
#     ID operation index
#  1: A1      open     1
#  2: A1      open     1
#  3: A1     close     1
#  4: A1              NA
#  5: A1      open     2
#  6: A1     close     2
#  7: B2              NA
#  8: B2      open     1
#  9: B2      open     1
# 10: B2      open     1
# 11: B2     close     1
# 12: B2    upload    NA
# 13: B2      open     2
# 14: B2     close     2
# 15: B2      open     3
# 16: B2     close     3
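
The call above only works once the index column from the question exists. As a rough end-to-end sketch, assuming the data.table package is installed, the two steps can be chained on a fresh copy of the example data. The name ops_dt is a hypothetical choice made for this illustration and is not part of the original answer.

```r
library(data.table)

# Example data from the question, stored under a hypothetical name (ops_dt)
ops_dt <- data.table(
  ID = rep(c("A1", "B2"), times = c(6, 10)),
  operation = c("open", "open", "close", "", "open", "close",
                "", "open", "open", "open", "close", "upload",
                "open", "close", "open", "close")
)

# Step 1 (the question's approach): packet number across the whole table
is_close <- ops_dt$operation == "close"
ops_dt[, index := cumsum(c(1, is_close[-length(is_close)]))]
ops_dt[!operation %in% c("open", "close"), index := NA]

# Step 2 (the answer's approach): restart the numbering within each ID
ops_dt[!is.na(index), index := rleid(index), by = .(ID)]

print(ops_dt)
```

Because rleid() numbers consecutive runs of equal values, it maps the global packet numbers 3, 4, 5 within B2 to 1, 2, 3.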

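Note that dt and df are the names of built-in R functions. Although R can distinguish variables from functions, it is better to use other names.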
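
If an extra package is not wanted, a similar result can probably be obtained in base R by running the question's cumsum logic separately within each ID via ave(). This is a minimal sketch of an alternative technique, not the method from the answer above. The name ops_df is a hypothetical one, chosen to avoid clashing with the built-in df.

```r
# Example data from the question, stored under a hypothetical name (ops_df)
ops_df <- data.frame(
  ID = rep(c("A1", "B2"), times = c(6, 10)),
  operation = c("open", "open", "close", "", "open", "close",
                "", "open", "open", "open", "close", "upload",
                "open", "close", "open", "close")
)

# 1 on every row that closes a packet, 0 elsewhere
is_close <- as.integer(ops_df$operation == "close")

# Within each ID: count the closes seen on earlier rows, then add 1
ops_df$index <- ave(
  is_close, ops_df$ID,
  FUN = function(x) cumsum(c(0L, x[-length(x)])) + 1L
)

# Rows that are neither open nor close belong to no packet
ops_df$index[!ops_df$operation %in% c("open", "close")] <- NA

print(ops_df)
```

Shifting the close indicator by one row inside each group is what keeps a close row in the same packet as the open rows before it.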
