Counting how many times an element repeats or does not repeat in a sequence (R)

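I have a sequence of events coded as A, B, and C. For each element, I need to count how many times it has repeated so far; when the element is not repeating, its counter should decrease by one for every row. The first time each item is encountered, its counter is zero. For example: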

x<-c('A','A','A','B','C','C','A','B','A','C')
y<-c(0,1,2,0,0,1,-2,-4,-4,-3)
cbind(x,y)

      x   y   
 [1,] "A" "0" 
 [2,] "A" "1" 
 [3,] "A" "2" 
 [4,] "B" "0" 
 [5,] "C" "0" 
 [6,] "C" "1" 
 [7,] "A" "-2"
 [8,] "B" "-4"
 [9,] "A" "-4"
[10,] "C" "-3"

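I think this is an R way of solving the problem. We can compute the index for every distinct element in the same way, offset it by the element's initial position, and then combine the results.
First, compute the index for each unique element of x separately: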
library(data.table)  # provides rleid()
sepIndex <- lapply(unique(x), function(i) {
    # rleid() labels each run of `x == i`; duplicated() flags every row after the first in a run.
    # Rows that repeat `i` contribute 1, all other rows contribute -1; cumsum() accumulates them.
    # Adding min(which(x == i)) offsets the counter so it is 0 at the first occurrence of `i`.
    s <- cumsum(ifelse(duplicated(rleid(x == i)) & x == i, 1, -1)) + min(which(x == i))
    # Keep the counter only on the rows where `i` occurs.
    replace(s, x != i, NA)
})

Then use the Reduce function to merge the list into a single vector, which gives the desired result:
Reduce(function(x, y) ifelse(is.na(x), y, x), sepIndex)
#  [1]  0  1  2  0  0  1 -2 -4 -4 -3
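
If data.table is not available, the same idea can probably be reproduced in base R. The sketch below is an assumption on my part, not part of the answer above: `run_id` is a hypothetical stand-in for `rleid()` built on `rle()`, and `counter_for` is just an illustrative name for the per-element step.

```r
x <- c('A','A','A','B','C','C','A','B','A','C')

# Hypothetical helper: numbers consecutive runs of equal values, like rleid()
run_id <- function(v) {
  r <- rle(v)
  rep(seq_along(r$lengths), r$lengths)
}

# Counter for one element `i`, NA on rows where `i` does not occur
counter_for <- function(i) {
  is_i <- x == i
  # +1 on rows that repeat `i`, -1 on every other row
  step <- ifelse(duplicated(run_id(is_i)) & is_i, 1, -1)
  # offset so the counter is 0 at the first occurrence of `i`
  s <- cumsum(step) + min(which(is_i))
  replace(s, !is_i, NA)
}

sepIndex <- lapply(unique(x), counter_for)
# Each row is non-NA in exactly one list element, so take the first non-NA value
result <- Reduce(function(a, b) ifelse(is.na(a), b, a), sepIndex)
print(result)
```

The arguments of the merging function are named `a` and `b` here only to avoid shadowing the data vector `x`.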

Here is another approach using base R:
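# Row positions of each distinct element (lapply avoids sapply() simplifying to a matrix
# when every element occurs equally often)
positions <- lapply(unique(x), function(t) which(x == t))
# A gap of 1 is an immediate repeat (+1); a gap of s > 1 means s rows without a repeat (-s)
values <- lapply(positions, function(p) { s <- diff(p); c(0, cumsum(ifelse(s > 1, -s, s))) })
df <- data.frame(positions = unlist(positions), values = unlist(values))
# Restore the original row order
df[order(df$positions), "values"]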
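
To see why the gap trick works, it may help to trace a single element by hand. The following is a small illustrative walk-through for 'A' only; the variable names (`pos_A`, `gaps`, `steps`, `counter_A`) are mine and do not appear in the answer.

```r
x <- c('A','A','A','B','C','C','A','B','A','C')

pos_A <- which(x == 'A')                 # rows where 'A' occurs
gaps  <- diff(pos_A)                     # distance between consecutive occurrences
# gap of 1: the row repeats 'A' (+1); gap of s > 1: s rows passed without a repeat (-s)
steps <- ifelse(gaps > 1, -gaps, gaps)
counter_A <- c(0, cumsum(steps))         # counter starts at 0 on the first 'A'

print(rbind(pos_A, counter_A))
```

A gap of s counts as -s because none of the s rows after the previous 'A' is a repeat, including the row where 'A' reappears.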

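Sorry, small mistake: the value in row 7 should be -2. In row 7 the event is 'A'. The previous value of the A counter was 2 (row 3), so the counter is 1 at row 4, 0 at row 5, -1 at row 6, and -2 at row 7. The same goes for B: the last B counter value was 0, and there are 4 rows since the previous B. The counter increases by 1 if the current event is the same as in the previous row and decreases by 1 if it is different, and each event type has its own counter. I also fixed the A value in row 9, which was wrong as well. This is what happens when I count things by hand.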
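
The rule described in this comment can also be written as a plain loop, which is handy as a reference when checking the vectorised answers. This is only a sketch of my reading of the rule; `count_repeats` is a hypothetical name, and it assumes `x` is a character vector without NA values.

```r
x <- c('A','A','A','B','C','C','A','B','A','C')

count_repeats <- function(x) {
  counters <- numeric(0)            # named vector: one counter per event type seen so far
  out <- numeric(length(x))
  for (k in seq_along(x)) {
    ev <- x[k]
    is_repeat <- k > 1 && x[k - 1] == ev
    # every other event type already seen loses 1 on this row
    others <- names(counters) != ev
    counters[others] <- counters[others] - 1
    if (!(ev %in% names(counters))) {
      counters[ev] <- 0             # first encounter starts at zero
    } else {
      # same event as the previous row: +1, otherwise -1
      counters[ev] <- counters[ev] + if (is_repeat) 1 else -1
    }
    out[k] <- counters[ev]
  }
  out
}

print(count_repeats(x))
```

The loop is slower than the vectorised versions on long sequences, but it follows the stated rule row by row, so it is easy to verify against the expected `y`.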