Finding the indices of sequential repeats in a string in R
I have a string that has been converted to a character vector:
string <- c("A","A","A","C","G","G","C","C","T","T","T","T")
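I tried looking into str_locate and some other str functions, but could not find an answer. Any help is appreciated.

We can split the index sequence into a list by the run-length id of string, take the range of each group, and rbind the list elements: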
rl <- rle(string)
lst <- lapply(split(seq_along(string), rep(seq_along(rl$values), rl$lengths)), range)
names(lst) <- rl$values
do.call(rbind, lst)
# [,1] [,2]
#A 1 3
#C 4 4
#G 5 6
#C 7 8
#T 9 12
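To see why the split works, it can help to look at the intermediate pieces. The following is a minimal sketch on the same string vector; the name run_id is introduced here only for illustration.

```r
string <- c("A","A","A","C","G","G","C","C","T","T","T","T")

rl <- rle(string)
rl$values   # one entry per run: "A" "C" "G" "C" "T"
rl$lengths  # length of each run: 3 1 2 2 4

# Repeat each run's index by its length to get a run id for every element
run_id <- rep(seq_along(rl$values), rl$lengths)
run_id
# 1 1 1 2 3 3 4 4 5 5 5 5

# Splitting the positions by run id groups the indices that belong to each run;
# range() on each group then gives the first and last index of that run
groups <- split(seq_along(string), run_id)
groups
```

Because the grouping key is the run id rather than the letter itself, the two separate runs of "C" stay in separate groups.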
Or using tidyverse (with rleid from data.table):
library(tidyverse)
library(data.table)
string %>%
tibble(letter = .) %>%
mutate(rn = row_number()) %>%
group_by(grp = rleid(letter)) %>%
summarise(letter = first(letter),
start = first(rn),
end = last(rn)) %>%
ungroup %>%
select(-grp)
I would take the cumulative sum of the run lengths from rle:
s <- rle(string)        # run values and run lengths
v <- cumsum(s$lengths)  # end index of each run
data.frame(var = s$values, start = v + 1 - s$lengths, end = v)
var start end
1 A 1 3
2 C 4 4
3 G 5 6
4 C 7 8
5 T 9 12
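If this is needed in more than one place, the same idea can be wrapped in a small function. This is only a sketch: run_ranges is a hypothetical helper name, not part of base R or any package, and the final filter assumes that "repeats" means runs longer than one element.

```r
# Hypothetical helper: start and end index of every run in a vector
run_ranges <- function(x) {
  r <- rle(x)
  end <- cumsum(r$lengths)          # last index of each run
  start <- end - r$lengths + 1L     # first index of each run
  data.frame(value = r$values, start = start, end = end,
             stringsAsFactors = FALSE)
}

string <- c("A","A","A","C","G","G","C","C","T","T","T","T")
res <- run_ranges(string)
res

# Assumption: keep only runs where a value actually repeats (length > 1)
repeats <- res[res$end > res$start, ]
repeats
```

On the example vector the filter drops only the single "C" at position 4 and keeps the other four runs.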