R: Creating a column that tracks the number of cycles completed per subject
I am working with a dataset that counts how many times subjects visit specific locations (or location types). Each subject's visit count resets whenever that subject visits a particular location (location "X" in my example):
library(dplyr)
location <- c("A", "B", "X", "A", "C", "X", "A", "X", "C", "A", "B", "B", "A", "A", "X")
group <- c(1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0)
id <- c(111, 111, 111, 111, 112, 112, 113, 113, 113, 113, 113, 114, 114, 114, 114)
df <- data.frame(id, location, group)
df <- within(df, {
  ## this produces a lot of warnings, but it achieves my desired result
  count = ave(id, group, cumsum(group == 0), id, FUN = seq)
}) %>%
  mutate(count = ifelse(group == 0, yes = 0, no = count)) ## mark restarts
print(df)
    id location group count
1  111        A     1     1
2  111        B     1     2
3  111        X     0     0
4  111        A     1     1
5  112        C     1     1
6  112        X     0     0
7  113        A     1     1
8  113        X     0     0
9  113        C     1     1
10 113        A     1     2
11 113        B     1     3
12 114        B     1     1
13 114        A     1     2
14 114        A     1     3
15 114        X     0     0
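The warnings mentioned in the code comment most likely come from `FUN = seq`: called on a single number, `seq(111)` returns `1:111`, which `ave()` then has to squeeze into a group of length one. A minimal sketch in base R, assuming the same example data, that should give the same `count` column without that mismatch by using `seq_along` instead (the helper name `run` is only illustrative):

```r
location <- c("A", "B", "X", "A", "C", "X", "A", "X", "C", "A", "B", "B", "A", "A", "X")
group <- c(1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0)
id <- c(111, 111, 111, 111, 112, 112, 113, 113, 113, 113, 113, 114, 114, 114, 114)
df <- data.frame(id, location, group)

# Running number of "X" visits seen so far; it changes every time a restart occurs
run <- cumsum(df$group == 0)

# seq_along() always returns one value per row of the group,
# whereas seq() on a length-one group expands the id value itself
df$count <- ave(df$id, df$group, run, df$id, FUN = seq_along)

# Mark restarts (visits to "X") with 0
df$count[df$group == 0] <- 0
print(df)
```

This keeps the original grouping logic unchanged and only swaps the function applied within each group.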
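I have a grouping variable that lets me filter between "X" and "non-X" locations, but I would like to track how many sequences (cycles) each subject goes through.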
This function returns what I am looking for, but I am not sure it will scale well to my real data:
trackCycle <- function(sequence) {
  cycle <- 1
  out <- c()
  for (i in seq_along(sequence)) {
    ## a 0 anywhere after the first row marks a restart
    if (i != 1 && sequence[i] == 0) {
      cycle <- cycle + 1
      out <- c(out, 0)
    } else {
      out <- c(out, cycle)
    }
  }
  out
}
df %>%
  group_by(id) %>%
  mutate(cycle = trackCycle(count))
Simple solution from the comments:
df <- df %>%
  group_by(id) %>%
  mutate(cycle = 1 + cumsum(location == "X"))
df[df$location == "X", "cycle"] <- 0
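For reference, the same idea can be sketched in base R as a single assignment, without the separate overwrite step. This is only an illustration on the example data, and the names `is_x` and `x_seen` are made up for it. One difference from `trackCycle` is worth keeping in mind: if a subject's first row is "X", `trackCycle` reports cycle 1 for that row, while the cumulative-sum approach reports 0 and starts the following rows at 2. No subject in the example data starts with "X", so the two agree here.

```r
location <- c("A", "B", "X", "A", "C", "X", "A", "X", "C", "A", "B", "B", "A", "A", "X")
id <- c(111, 111, 111, 111, 112, 112, 113, 113, 113, 113, 113, 114, 114, 114, 114)
df <- data.frame(id, location)

# 1 where the subject visits "X", 0 otherwise
is_x <- as.integer(df$location == "X")

# Number of "X" visits seen so far, counted separately for each id
x_seen <- ave(is_x, df$id, FUN = cumsum)

# "X" rows get 0; every other row gets 1 + the number of completed cycles
df$cycle <- ifelse(is_x == 1, 0, 1 + x_seen)
print(df)
```

Because `cumsum` and `ifelse` are vectorized, this avoids growing a vector inside a loop, which is the part of `trackCycle` that tends to get slow on long inputs.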
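For each id, you can use 1 + cumsum(location == "X"), and replace the indices where location == "X" with 0.

@nongkrong, thanks for the quick response! Could you elaborate in a comment? Sorry, I don't understand where your suggestion should be implemented!

No problem, I just mean the following: df %>% group_by(id) %>% mutate(cycle = 1 + cumsum(location == "X")). Then, df[df$location == "X", "cycle"] <- 0

@nongkrong Aha! Thank you very much, that was easy. I will edit my post and put your solution there.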