How do I calculate the duration for which a value stays unchanged in R?
I have a matrix as follows:
TIME YORN
[1,] 24 0
[2,] 26 0
[3,] 28 0
[4,] 30 1
[5,] 32 0
[6,] 34 1
[7,] 36 0
[8,] 38 0
[9,] 40 0
[10,] 42 0
[11,] 44 1
[12,] 45 0
[13,] 48 1
[14,] 50 1
[15,] 53 1
[16,] 54 1
[17,] 56 1
[18,] 58 0
[19,] 60 1
[20,] 62 0
[21,] 64 1
[22,] 67 1
[23,] 68 1
[24,] 70 1
[25,] 72 1
[26,] 74 1
[27,] 89 1
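I want to calculate the total duration of "TIME" over which the "YORN" value stays at 1 for several consecutive rows (rather than switching back to 0 immediately).
How can I do this in R?

Here is a possible rle solution (I'm not sure how to simplify it), assuming dat is your matrix: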
temp <- rle(dat[, 2] == 1L) # run-length encode the "YORN == 1" indicator
temp$values[temp$lengths == 1L] <- FALSE # treat every run of length 1 as FALSE
indx <- inverse.rle(temp) # expand back to a logical vector with one entry per row
indx2 <- cumsum(c(1L, diff(which(indx))) > 1L) # give each remaining run of 1s its own group id
sum(tapply(dat[indx, 1], indx2, function(x) diff(range(x)))) # last minus first TIME per run, summed
## [1] 33
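
To see what the rle step works with, here is a small sketch that runs rle on the YORN column alone. The vector yorn is typed out from the question's data so the snippet is self-contained; the names yorn, runs and keep are illustrative and not part of the answer above.

```r
# YORN column from the question, typed out by hand (assumed to match the matrix)
yorn <- c(0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0,
          1, 1, 1, 1, 1, 1, 1)

# rle() returns the length of each run and the value that run holds
runs <- rle(yorn == 1)
runs$lengths
runs$values

# Runs that hold TRUE (YORN == 1) and span at least two rows
keep <- runs$values & runs$lengths >= 2
runs$lengths[keep]
```

Only the runs selected by keep contribute to the total, since a single isolated 1 spans no time on its own.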

If the correct result is 33:
m <- structure(c(24L, 26L, 28L, 30L, 32L, 34L, 36L, 38L, 40L, 42L,
44L, 45L, 48L, 50L, 53L, 54L, 56L, 58L, 60L, 62L, 64L, 67L, 68L,
70L, 72L, 74L, 89L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L,
0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L
), .Dim = c(27L, 2L), .Dimnames = list(NULL, c("TIME", "YORN"
)))
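# TRUE on the first row of each run of 1s (row 1, or a 0 -> 1 transition)
start <- c(m[1,2] == 1L, diff(m[,2]) == 1L)
# TRUE on the last row of each run of 1s (a 1 -> 0 transition, or the final row)
end <- c(diff(m[,2]) == -1L, m[nrow(m),2] == 1L)
# last TIME minus first TIME per run; single-row runs contribute 0
sum(m[end, 1] - m[start, 1])
#[1] 33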
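
Taking the last TIME minus the first TIME of a run gives the same number as adding up the gaps between consecutive TIME values in that run, as long as TIME is increasing, because the gaps telescope. A quick check on the two multi-row runs in the question's data (block1 and block2 are illustrative names, with values copied by hand from the matrix):

```r
# TIME values of the two runs where YORN stays at 1 for more than one row
block1 <- c(48, 50, 53, 54, 56)
block2 <- c(64, 67, 68, 70, 72, 74, 89)

# Sum of consecutive gaps within each run
gaps1 <- sum(diff(block1))
gaps2 <- sum(diff(block2))

# Last minus first within each run
span1 <- block1[length(block1)] - block1[1]
span2 <- block2[length(block2)] - block2[1]

c(gaps1, span1)  # both are 8
c(gaps2, span2)  # both are 25
gaps1 + gaps2    # 33
```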

dplyr solution:
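library(dplyr)
# df1 is assumed to be the matrix as a data frame, e.g. df1 <- as.data.frame(m)
df1 %>%
  mutate(
    changed = !is.na(lag(YORN)) & YORN != lag(YORN)) %>% # TRUE where YORN flips
  group_by(cumsum(changed), YORN) %>%                     # one group per run
  filter(min(TIME) != max(TIME) & YORN == 1) %>%          # keep multi-row runs of 1s
  summarize(TOTAL = sum(TIME - lag(TIME), na.rm = TRUE )) %>%
  ungroup() %>%
  summarize(TOTAL = sum(TOTAL))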
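
For comparison, the same grouping idea (label each run with a cumulative sum of change points, then measure each run) can be sketched in base R without any packages. This is only an illustrative variant, not the answerer's code; the vectors time and yorn are typed out from the question's data, and changed, run_id, dur, is_one and total are made-up names.

```r
# TIME and YORN columns from the question, typed out by hand
time <- c(24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 45, 48, 50, 53, 54,
          56, 58, 60, 62, 64, 67, 68, 70, 72, 74, 89)
yorn <- c(0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0,
          1, 1, 1, 1, 1, 1, 1)

# TRUE wherever YORN differs from the previous row; the first row never counts
changed <- c(FALSE, diff(yorn) != 0)

# Every change starts a new run, so the cumulative sum labels the runs
run_id <- cumsum(changed)

# Duration of each run: last TIME minus first TIME (0 for single-row runs)
dur <- tapply(time, run_id, function(x) max(x) - min(x))

# Which runs consist of 1s
is_one <- tapply(yorn, run_id, function(x) x[1] == 1)

total <- sum(dur[is_one])
total
```

Single-row runs have a duration of 0 here, so they drop out of the total without an explicit filter.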
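
What exactly do you mean by total duration? Do you want the last time value minus the first one in each block?
@Davidernburg: Not the last value minus the first, but the difference between every two consecutive time values in the block, and then the sum of all those differences. The reason this makes a difference is that the time values in a block are not in sequence. So the sum of all those differences is what I ultimately want to calculate.
You should probably give some feedback on the answers provided.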