R: count occurrences of a letter before the first occurrence of a specific letter
Tags: r, aggregate
I want to count the occurrences of "I" before the first "C" within each Id. I tried the code below, but it counts every "I" in the column.
The code I tried:
library(plyr)
Impres = ddply(df, .(Id), summarize, No_of_I_before_First_C = length(which(Character == "I")))
Sample data
Id Character
1 I
1 I
1 C
1 I
2 I
2 C
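For anyone who wants to run the code in this thread, the sample table can be typed in as a data frame. This is only a sketch: the name df comes from the code in the question, and stringsAsFactors = FALSE is an assumption so that Character is a plain character column. The last line shows what the attempted code effectively computes, namely every "I" per Id rather than only those before the first "C".

```r
# Sample data from the question, entered by hand
df <- data.frame(
  Id = c(1, 1, 1, 1, 2, 2),
  Character = c("I", "I", "C", "I", "I", "C"),
  stringsAsFactors = FALSE  # assumption: keep Character as character, not factor
)

# Counting every "I" per Id, which is what the attempted code does
all_i <- tapply(df$Character == "I", df$Id, sum)
all_i
```

For Id 1 this gives 3 instead of the wanted 2, because the "I" after the "C" is counted too.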
The output should look like this
Id Count_Of_I_before_First_C
1 2
2 1
Here is an idea:
first1 <- function(x, letter) {
  which(x == letter)[1] - 1
}
aggregate(Character ~ Id, df, first1, 'C')
#  Id Character
#1  1         2
#2  2         1
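Note that first1 returns the position of the first "C" minus one, so it equals the number of "I" only when nothing else appears before that "C", and it gives NA for an Id with no "C" at all. A possible base R variant is sketched below. The name count_before and its arguments are hypothetical, and treating "no C in the group" as "count every I" is an assumption rather than something the question specifies.

```r
df <- data.frame(
  Id = c(1, 1, 1, 1, 2, 2),
  Character = c("I", "I", "C", "I", "I", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count `target` values that occur before the first `stop`
count_before <- function(x, target = "I", stop = "C") {
  first_stop <- match(stop, x)               # NA when `stop` never occurs
  if (is.na(first_stop)) {
    first_stop <- length(x) + 1              # assumption: then scan the whole group
  }
  sum(head(x, first_stop - 1) == target)     # only look at elements before `stop`
}

res <- aggregate(Character ~ Id, df, count_before)
res
```

On the sample data this returns the same counts as first1, while count_before(c("I", "X", "I", "C")) counts only the two "I" values and ignores the "X".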
Here is a data.table solution:
library(data.table)
dt <- data.table(Id = c(1,1,1,1,2,2), Character = c('I', 'I', 'C', 'I', 'I', 'C'))
# cnt.c stays 0 until the first "C" appears within each Id
dt[, cnt.c := cumsum(Character == "C"), by = Id]
# count the rows that come before the first "C"
res <- dt[cnt.c == 0, .(Count_Of_I_before_First_C = length(Character)), by = Id]
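The running-count trick does not depend on data.table. As a rough illustration, the same flag can be built row by row in base R with ave and then summed per Id with tapply. The names cnt_c and counts are made up for this sketch, and it counts only rows equal to "I", which is a small deviation from the length(Character) used above.

```r
df <- data.frame(
  Id = c(1, 1, 1, 1, 2, 2),
  Character = c("I", "I", "C", "I", "I", "C"),
  stringsAsFactors = FALSE
)

# Running count of "C" within each Id; 0 means no "C" has been seen yet
cnt_c <- ave(as.integer(df$Character == "C"), df$Id, FUN = cumsum)

# Sum the "I" rows that fall before the first "C", per Id
counts <- tapply(df$Character == "I" & cnt_c == 0, df$Id, sum)
counts
```

Unlike the filtered data.table result, an Id whose first row is "C" still appears here, with a count of 0.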
Maybe:
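library(dplyr)
rlei <- function(x) {
  r <- rle(as.character(x))  # rle() needs a plain vector, not a factor
  I <- which(r$values == "I")
  C <- which(r$values == "C")
  # length of the first run of "I" that ends before the first "C" (NA if none)
  r$lengths[I[I < C[1]]][1]
}
group_by(df, Id) %>%
  summarise(Count_Of_I_before_First_C = rlei(Character))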
This will be quite slow on a large dataset.
@Bulat I just followed the question's aggregate tag (i.e. no packages). I know dplyr and data.table both have more efficient ways.
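Another dplyr option, where foo is a helper that returns the count for one Id (its definition is not shown here):
df %>%
  group_by(Id) %>%
  summarise(Count_Of_I_before_First_C = foo(Character))
Result:
# A tibble: 2 × 2
     Id Count_Of_I_before_First_C
  <dbl>                     <int>
1     1                         2
2     2                         1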
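Since the body of foo is missing from the thread, here is one way such a helper could look. This is a guess at a compatible definition, not the original one: it only assumes that foo takes the Character values of a single Id and returns an integer count, which matches the <int> column in the output above.

```r
# Hypothetical stand-in for the undefined helper `foo`
foo <- function(x) {
  x <- as.character(x)                    # also accept factor input
  # keep "I" values for which no "C" has occurred so far, then count them
  sum(x == "I" & cumsum(x == "C") == 0)
}

foo(c("I", "I", "C", "I"))  # values of Id 1 in the sample data
foo(c("I", "C"))            # values of Id 2 in the sample data
```

With a definition like this, the summarise call above needs no other changes, because foo is applied to each group's Character column.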