R: How to modify the n values following selected rows
Tags: r, dataframe, dplyr, tidyr
I want to replace the 3 values following a 1 with 1, and replace all remaining NAs with 0.
My data (reproducible example)
What I want
Where I am so far
I am trying to avoid a giant for loop in R. I know this is not the right direction, but I don't know what else to do.
get_all_the_ones <- function(col) {
  for (ro in 1:nrow(datadf[col])) {
    if (datadf[ro, col] == 1) {
      datadf[seq(ro, ro + 3), col] = 1
    }
  }
}
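For reference, the loop above has a few problems as far as I can tell: `if (NA == 1)` stops with "missing value where TRUE/FALSE needed", the function changes only a local copy of `datadf` and returns nothing, and because it writes into the column it is still reading, each new 1 would trigger another fill further down. A minimal sketch of a loop that avoids these issues, assuming the column is passed in as a plain vector (the names `fill_with_loop` and `n` are made up for illustration):

```r
# Hypothetical helper, not from the original post: loop-based fill.
# x: a vector of 1s and NAs; n: how many following values to set to 1.
fill_with_loop <- function(x, n = 3) {
  out <- rep(0, length(x))               # start with all zeros
  for (i in which(x == 1)) {             # which() silently skips NAs
    idx <- i:min(i + n, length(x))       # do not run past the end
    out[idx] <- 1                        # write to `out`, read from `x`
  }
  out
}

fill_with_loop(c(NA, 1, NA, NA, NA, NA))
# 0 1 1 1 1 0
```

Reading from `x` while writing to `out` is what keeps a filled-in 1 from starting a new run of its own.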
This can easily be done by writing a helper function like this:
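my_fun <- function(x){
  require(dplyr)
  # TRUE where the current value or any of the 3 previous values is 1
  cond <- (x == 1) | (lag(x, 1) == 1) | (lag(x, 2) == 1) | (lag(x, 3) == 1)
  # 1 where cond is TRUE, 0 where it is FALSE or NA
  new_values <- if_else(cond == TRUE, 1, 0, missing = 0)
  return(new_values)
}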
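To see why the condition vector works, it may help to run the two steps of the helper on a small vector. The sketch below uses a made-up input, not the asker's data. It relies on `lag` being `dplyr::lag`; without dplyr attached, `lag` resolves to `stats::lag`, which does not shift a plain vector.

```r
library(dplyr)

x <- c(NA, 1, NA, NA, NA, NA)   # made-up toy input

# NA | TRUE is TRUE, so a single hit in the window is enough;
# positions with no hit stay NA because every comparison is NA.
cond <- (x == 1) | (lag(x, 1) == 1) | (lag(x, 2) == 1) | (lag(x, 3) == 1)
print(cond)
# NA TRUE TRUE TRUE TRUE NA

# `missing = 0` turns the remaining NA conditions into 0.
result <- if_else(cond, 1, 0, missing = 0)
print(result)
# 0 1 1 1 1 0
```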
You need the lag function. Try this:
library(dplyr)
x <- datadf$typeA
# TRUE for an NA that sits 1, 2 or 3 rows after a 1 with only NAs in between
a <- (is.na(x) & lag(x,1)==1) |
  (is.na(x) & is.na(lag(x)) & lag(x,2)==1) |
  (is.na(x) & is.na(lag(x)) & is.na(lag(x,2)) & lag(x,3)==1)
a[is.na(a)] <- FALSE
datadf$typeA[a] <- 1
datadf$typeA[is.na(datadf$typeA)] <- 0
# same steps for typeB
x <- datadf$typeB
a <- (is.na(x) & lag(x,1)==1) |
  (is.na(x) & is.na(lag(x)) & lag(x,2)==1) |
  (is.na(x) & is.na(lag(x)) & is.na(lag(x,2)) & lag(x,3)==1)
a[is.na(a)] <- FALSE
datadf$typeB[a] <- 1
datadf$typeB[is.na(datadf$typeB)] <- 0
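This code replaces only the NA values.
Thank you very much, it works perfectly and the code is minimal and easy to understand. The trick here is to create a condition vector, got it.
Thank you, your solution is very good for base R. I chose dplyr because it scales better to my real data frame. I appreciate your effort.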
Applying the helper function my_fun to every column whose name starts with "type":
library(dplyr)
datadf %>%
  mutate(across(starts_with("type"), ~my_fun(.)))
#               seq_time typeA typeB
# 1  2020-09-01 04:30:00     0     0
# 2  2020-09-01 04:30:01     0     0
# 3  2020-09-01 04:30:02     0     0
# 4  2020-09-01 04:30:03     0     1
# 5  2020-09-01 04:30:04     0     1
# 6  2020-09-01 04:30:05     1     1
# 7  2020-09-01 04:30:06     1     1
# 8  2020-09-01 04:30:07     1     1
# 9  2020-09-01 04:30:08     1     0
# 10 2020-09-01 04:30:09     0     0
# 11 2020-09-01 04:30:10     0     0
# 12 2020-09-01 04:30:11     0     0
Result of the lag-based code that replaces only the NA values:
#               seq_time typeA typeB
# 1  2020-09-01 04:30:00     0     0
# 2  2020-09-01 04:30:01     0     0
# 3  2020-09-01 04:30:02     0     0
# 4  2020-09-01 04:30:03     0     1
# 5  2020-09-01 04:30:04     0     1
# 6  2020-09-01 04:30:05     1     1
# 7  2020-09-01 04:30:06     1     1
# 8  2020-09-01 04:30:07     1     1
# 9  2020-09-01 04:30:08     1     0
# 10 2020-09-01 04:30:09     0     0
# 11 2020-09-01 04:30:10     0     0
# 12 2020-09-01 04:30:11     0     0
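Both answers hard-code a window of 3. Since the title asks about the n following values, here is a rough sketch of how the same idea could be generalized without a row-by-row loop, using base R only. The function name `fill_after_one`, its argument `n`, and the toy data frame are all made up for illustration. Like my_fun, it sets every value in the window to 1 and everything else to 0, so it is not limited to replacing NAs.

```r
# Hypothetical generalization, not from the original answers.
# Marks each 1 and the n values after it; everything else becomes 0.
fill_after_one <- function(x, n = 3) {
  hit <- !is.na(x) & x == 1                       # positions holding a 1
  out <- hit
  for (k in seq_len(n)) {
    # shift `hit` down by k positions, padding the top with FALSE
    shifted <- c(rep(FALSE, k), hit)[seq_along(hit)]
    out <- out | shifted
  }
  as.numeric(out)
}

# Made-up toy data, not the asker's data frame
toy <- data.frame(
  typeA = c(NA, NA, 1, NA, NA, NA, NA),
  typeB = c(1, NA, NA, NA, NA, NA, NA)
)
cols <- c("typeA", "typeB")
toy[cols] <- lapply(toy[cols], fill_after_one, n = 3)

toy$typeA
# 0 0 1 1 1 1 0
toy$typeB
# 1 1 1 1 0 0 0
```

The loop here runs over the n shifts rather than over rows, so its cost grows with the window size, not with the number of rows.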