R: create a column that tracks the number of cycles completed by each subject

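I'm working with a dataset that counts how many times subjects visit particular locations (or types of locations). Each subject's visit count resets whenever the subject visits a certain location (in my example, let's call it location "X"):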

library(dplyr)
location <- c("A", "B", "X", "A", "C", "X", "A", "X", "C", "A", "B", "B", "A", "A", "X") 
group <- c(1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0)
id <- c(111, 111, 111, 111, 112, 112, 113, 113, 113, 113, 113, 114, 114, 114, 114)

df <- data.frame(id, location, group)  ## column order matches the printed output below

df <- within(df, {
    ## this produces a lot of warnings, but it achieves my desired result
    count = ave(id, group, cumsum(group == 0), id, FUN = seq)
    }) %>%
    mutate(count = ifelse(group == 0, yes = 0, no = count)) ## mark restarts

print(df)
     id location group count
 1  111        A     1     1
 2  111        B     1     2
 3  111        X     0     0
 4  111        A     1     1
 5  112        C     1     1
 6  112        X     0     0
 7  113        A     1     1
 8  113        X     0     0
 9  113        C     1     1
10  113        A     1     2
11  113        B     1     3
12  114        B     1     1
13  114        A     1     2
14  114        A     1     3
15  114        X     0     0
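
The warnings most likely come from FUN = seq: when a group holds a single value such as 112, seq(112) expands to 1:112 instead of numbering that one row, and the extra values are discarded with a warning. A minimal sketch, assuming that is the cause, swaps in base R's seq_along, which always numbers the elements it receives. The name df_check is illustrative and not part of the original post.

```r
location <- c("A", "B", "X", "A", "C", "X", "A", "X", "C", "A", "B", "B", "A", "A", "X")
group <- c(1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0)
id <- c(111, 111, 111, 111, 112, 112, 113, 113, 113, 113, 113, 114, 114, 114, 114)

# Same grouping as in the post, but seq_along() numbers the rows of each group
# even when the group contains only one value.
count <- ave(id, group, cumsum(group == 0), id, FUN = seq_along)

# Mark restarts: rows with group == 0 (location "X") get a count of 0
count[group == 0] <- 0

df_check <- data.frame(id, location, group, count)  # illustrative name
print(df_check)
```

On the sample data this should give the same count column as the output above.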
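
I have a grouping variable that lets me filter between "X" and "non-X" locations, but I would like to track the number of sequences (cycles) each subject has gone through.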

This function returns what I'm looking for, but I'm not sure it will scale well to real data:

trackCycle <- function(sequence) {
    cycle <- 1
    out <- c()
    for (i in seq_along(sequence)) {
        ## a 0 anywhere except the first position starts a new cycle
        if (i != 1 && sequence[i] == 0) {
            cycle <- cycle + 1
            out <- c(out, 0)
        } else {
            out <- c(out, cycle)
        }
    }
    out
}

df %>%
   group_by(id) %>%
   mutate(cycle = trackCycle(count))
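
On the scaling concern: the loop grows out with c() on every iteration, which copies the vector each time. One possible vectorized rewrite that keeps the loop's behaviour (a 0 in the first position is not treated as a restart) is sketched below. The name trackCycleVec is hypothetical and was not part of the original post.

```r
trackCycleVec <- function(sequence) {
    if (length(sequence) == 0) return(numeric(0))
    restart <- sequence == 0
    restart[1] <- FALSE                      # like the loop, ignore a 0 in the first position
    # restart rows become 0; every other row gets 1 + number of restarts so far
    ifelse(restart, 0, 1 + cumsum(restart))
}

trackCycleVec(c(1, 2, 0, 1))     # counts for id 111 -> 1 1 0 2
trackCycleVec(c(1, 0, 1, 2, 3))  # counts for id 113 -> 1 0 2 2 2
```

It should drop into the same pipeline in place of trackCycle, i.e. mutate(cycle = trackCycleVec(count)) after group_by(id).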

Simple solution from the comments:

df <- df %>%
    group_by(id) %>%
    mutate(cycle = 1 + cumsum(location == "X"))

df[df$location == "X", "cycle"] <- 0
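
For reference, here is a base R sketch of the same idea that uses ave() for the per-id cumulative sum instead of dplyr. The variable names is_x and cycle are illustrative; the data are the sample vectors from the question.

```r
location <- c("A", "B", "X", "A", "C", "X", "A", "X", "C", "A", "B", "B", "A", "A", "X")
id <- c(111, 111, 111, 111, 112, 112, 113, 113, 113, 113, 113, 114, 114, 114, 114)

is_x <- location == "X"

# Number of "X" visits seen so far within each id, plus 1 for the current cycle
cycle <- 1 + ave(as.integer(is_x), id, FUN = cumsum)

# Rows that are themselves a visit to "X" are marked 0
cycle[is_x] <- 0

print(data.frame(id, location, cycle))
```

One difference from trackCycle to keep in mind: if a subject's very first row is an "X", this approach marks it 0 and starts the following rows at cycle 2, whereas trackCycle leaves that first row (and the rows after it) at cycle 1. The sample data has no such case, so both give the same result here.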

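For each id you can use
1 + cumsum(location == "X")
and then replace the entries where location == "X" with 0.

@nongkrong, thanks for the quick response! Could you elaborate on your comment? Sorry, I don't understand where your suggestion should be implemented!

No problem, I just meant the following:
df %>% group_by(id) %>% mutate(cycle = 1 + cumsum(location == "X"))
Then,
df[df$location == "X", "cycle"] <- 0

@nongkrong - Aha! Thank you so much, that was easy. I'll edit my post and put your solution there.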