R: filter the data after a specific row
I have a dataset like this:
id type value
1 001 0 1991
2 001 0 1992
3 001 1 1993
4 001 1 1994
5 002 1 1992
6 002 1 1993
7 003 0 1999
8 003 1 2000
9 003 0 2001
Within each id, I want to keep the first row where type is 1 and every row after it.
The expected result is:
id type value
3 001 1 1993
4 001 1 1994
5 002 1 1992
6 002 1 1993
8 003 1 2000
9 003 0 2001
I know the first step is to group by id, but I don't know what to do next.
Does anyone have any suggestions?
With dplyr:
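library(dplyr)
# Within each id, the running sum of type turns positive at the first type == 1 row
DF %>%
  group_by(id) %>%
  mutate(sel = cumsum(type)) %>%
  filter(sel > 0) %>%        # keep that row and every later row in the group
  select(id, type, value)    # drop the helper column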
The result is:
# A tibble: 6 x 3
# Groups: id [3]
id type value
<int> <int> <int>
1 1 1 1993
2 1 1 1994
3 2 1 1992
4 2 1 1993
5 3 1 2000
6 3 0 2001
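The cumsum approach above relies on type containing only 0 and 1. As a hedged alternative, the sketch below uses dplyr's cumany(), which stays TRUE once a condition has been met within a group, so no helper column is needed. The data frame is rebuilt here only to keep the example self-contained, and the name res is an illustrative assumption.

```r
library(dplyr)

# Same values as the DF used in this thread, rebuilt so the sketch runs on its own
DF <- data.frame(
  id    = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  type  = c(0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L),
  value = c(1991L, 1992L, 1993L, 1994L, 1992L, 1993L, 1999L, 2000L, 2001L)
)

res <- DF %>%
  group_by(id) %>%
  filter(cumany(type == 1)) %>%  # TRUE from the first type == 1 row onward
  ungroup()                      # return an ungrouped tibble

res
```

A group that never contains type == 1 is dropped entirely, which matches the behaviour of the cumsum version.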
In base R, you can subset the data to the rows where the cumsum of type within each id group is 1 or greater (that is, greater than 0):
idx <- as.logical(with(DF, ave(type, id, FUN = function(x) cumsum(x) >= 1)))
DF[idx, ]
# id type value
#3 1 1 1993
#4 1 1 1994
#5 2 1 1992
#6 2 1 1993
#8 3 1 2000
#9 3 0 2001
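To see why this works, it can help to look at the intermediate running sum before it is turned into a logical index. The following is a minimal sketch using the same data; the names running and keep are illustrative and not part of the original answer.

```r
# Same values as the DF used in this thread, rebuilt so the sketch runs on its own
DF <- data.frame(
  id    = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  type  = c(0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L),
  value = c(1991L, 1992L, 1993L, 1994L, 1992L, 1993L, 1999L, 2000L, 2001L)
)

# Running count of type == 1 rows, restarted for each id
running <- ave(DF$type, DF$id, FUN = cumsum)
running
# 0 0 1 2 1 2 0 1 1

# The count is 0 before the first type == 1 row of a group and at least 1 afterwards
keep <- running >= 1
DF[keep, ]
```

Row 9 (id 3, type 0) is kept because the running count for id 3 has already reached 1 at row 8.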
Data
DF <- structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L), type = c(0L,
0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L), value = c(1991L, 1992L, 1993L,
1994L, 1992L, 1993L, 1999L, 2000L, 2001L)), .Names = c("id",
"type", "value"), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6", "7", "8", "9"))
DF
With data.table:
library(data.table)
setDT(DF)[DF[, .I[cumsum(type) > 0], by = id]$V1]
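Note that setDT() converts DF to a data.table by reference, so DF itself is changed. If that is not wanted, one possible variant is sketched below: it works on a copy made with as.data.table() and filters each group through .SD instead of collecting row indices with .I. The names DT and res are illustrative assumptions.

```r
library(data.table)

# Same values as the DF used in this thread, rebuilt so the sketch runs on its own
DF <- data.frame(
  id    = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  type  = c(0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L),
  value = c(1991L, 1992L, 1993L, 1994L, 1992L, 1993L, 1999L, 2000L, 2001L)
)

DT <- as.data.table(DF)  # copy; DF stays a plain data.frame

# For each id, keep the rows of the group where the running sum of type is positive
res <- DT[, .SD[cumsum(type) > 0], by = id]

res
```

The .I form in the original one-liner is usually faster on large data because it avoids building .SD for every group; the .SD form is shown here only because it may be easier to read.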