cumsum is.na with rle, ignoring consecutive NA's
A simple question. Suppose I have the following data:
library(tidyverse)
df <- data.frame(group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
variable = c(NA, "a", NA, "b", "c", NA, NA, NA, NA, "a", NA, "c", NA, NA, "d", NA, NA, "a"))
df
   group variable
1      1     <NA>
2      1        a
3      1     <NA>
4      1        b
5      1        c
6      1     <NA>
7      1     <NA>
8      1     <NA>
9      1     <NA>
10     1        a
11     1     <NA>
12     1        c
13     1     <NA>
14     1     <NA>
15     1        d
16     2     <NA>
17     2     <NA>
18     2        a
I think I need to incorporate rle into my code:
df %>%
group_by(group, na_group = {na_group = rle(variable); rep(seq_along(na_group$lengths), na_group$lengths)}) %>%
mutate(newvariable = cumsum((is.na(variable)))) #?
Maybe map over group could work. Any suggestions?
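A note on the attempt above: rle() regards a missing value as unequal to the previous value, even when that value is also missing, so every NA in the raw column becomes its own run. The small sketch below illustrates this on a made-up vector (the names `x`, `raw_runs`, and `na_runs` are illustrative only) and shows that encoding `is.na(x)` instead collapses consecutive NAs into one run.

```r
# Toy vector with one isolated NA and one pair of consecutive NAs
x <- c(NA, "a", NA, NA, "b")

# rle() on the raw values: each NA is treated as a separate run
raw_runs <- rle(x)$lengths

# rle() on the logical vector: consecutive NAs form a single run
na_runs <- rle(is.na(x))$lengths

raw_runs  # five runs of length 1
na_runs   # run lengths 1, 1, 2, 1
```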
Another option is to use diff and cumsum on a logical vector:
library(data.table)
setDT(df)[, new := cumsum(c(TRUE, diff(is.na(variable)) > 0) ), group ]
Or using dplyr:
library(dplyr)
df %>%
group_by(group) %>%
mutate(new = cumsum(c(TRUE, diff(is.na(variable)) > 0)))
# A tibble: 18 x 3
# Groups: group [2]
# group variable new
# <dbl> <fct> <int>
# 1 1 <NA> 1
# 2 1 a 1
# 3 1 <NA> 2
# 4 1 b 2
# 5 1 c 2
# 6 1 <NA> 3
# 7 1 <NA> 3
# 8 1 <NA> 3
# 9 1 <NA> 3
#10 1 a 3
#11 1 <NA> 4
#12 1 c 4
#13 1 <NA> 5
#14 1 <NA> 5
#15 1 d 5
#16 2 <NA> 1
#17 2 <NA> 1
#18 2 a 1
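To see why the one-liner works, it can help to run it step by step on the variable values of group 1. The sketch below uses base R only. The names `x`, `na`, `starts`, `new_diff`, and `new_rle` are illustrative and do not come from the answer, and the rle() variant is an assumption about how the asker's original idea could be adapted, not something the answer proposes.

```r
# variable values for group 1, copied from the question
x <- c(NA, "a", NA, "b", "c", NA, NA, NA, NA, "a", NA, "c", NA, NA, "d")

na <- is.na(x)           # TRUE where the value is missing
starts <- diff(na) > 0   # TRUE where a non-NA value is followed by NA (a new NA run begins)

# The leading TRUE makes the counter start at 1 on the first row
new_diff <- cumsum(c(TRUE, starts))

# Hypothetical rle() variant: encode the logical vector, not the raw column,
# then count the NA runs and expand back to the original length
r <- rle(na)
new_rle <- rep(cumsum(r$values), r$lengths)

new_diff                      # 1 1 2 2 2 3 3 3 3 3 4 4 5 5 5
identical(new_diff, new_rle)  # TRUE for this vector
```

The two versions agree here because the vector starts with NA. If a group starts with a non-NA value, the diff version labels those leading rows 1 and the first NA run 2, while the rle() variant labels the leading rows 0, so the choice depends on which numbering is wanted.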