R: find rows where NA lies between 0 and 1

I want to identify the rows that contain NA and that lie between a 0 and a 1 (in either order). Consider this data.table:

DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))

# DT
# a
# 1:  0
# 2: NA
# 3: NA
# 4:  0
# 5: NA
# 6:  1
# 7:  1
# 8: NA
# 9:  0
# 10: NA
# 11:  1
# 12: NA
# 13: NA
# 14: NA
# 15:  0
# 16:  1
# 17:  1
# 18:  0
# 19: NA
# 20:  0

You can compute the start of each NA run, like this:

library("data.table")
DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))

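# Run-length encode the NA mask: one entry per run of consecutive NA / non-NA values
r <- DT[, rle(is.na(a))]
# V1 = is the run NA?, V2 = run length, start = first row of the run
R <- data.table(r$values, r$lengths, start=c(1, 1+head(cumsum(r$lengths), -1)))

i <- R[(V1), start]          # first row of each NA run
j <- R[(V1), start+V2-1]     # last row of each NA run
# Keep the NA runs whose two neighbours are one 0 and one 1 (their sum is 1)
i[(DT[i-1, a] + DT[j+1, a])==1]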
# result: [1]  5  8 10 12
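
The result above lists only the first row of each qualifying NA run. If every row of those runs is wanted (so that rows 13 and 14 are included as well), the same run-length idea can be extended. This is a minimal sketch in base R plus data.table; the names `run_start`, `run_end`, `is_na_run` and `rows` are illustrative, and it assumes the column contains only 0, 1 and NA.

```r
library(data.table)

DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))

# Run-length encode the NA mask and derive first/last row of every run
r <- rle(is.na(DT$a))
run_end   <- cumsum(r$lengths)
run_start <- run_end - r$lengths + 1

# Only NA runs that have a neighbour on both sides can be "between" two values
is_na_run <- r$values & run_start > 1 & run_end < nrow(DT)
s <- run_start[is_na_run]
e <- run_end[is_na_run]

# One neighbour is 0 and the other is 1 exactly when they sum to 1
keep <- DT$a[s - 1] + DT$a[e + 1] == 1

# Expand each kept run into all of its row numbers
rows <- unlist(Map(seq, s[keep], e[keep]))
rows
```

The explicit boundary check is a design choice of this sketch: it skips NA runs that touch the first or last row instead of indexing outside the table.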

You can try using approx:

DT[,b := approx((1:.N)[!is.na(a)],na.omit(a),1:.N)$y]
Then apply:

DT[, which(is.na(a) & b>0 & b<1)]

[1]  5  8 10 12 13 14
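
One edge case worth checking is an NA at the very start or end of the column. The sketch below uses a small made-up table (`DT2` and `hits` are hypothetical names, not from the question) and suggests the same filter still behaves sensibly there: `approx()` with its default `rule = 1` returns NA outside the range of the known points, and `which()` ignores conditions that evaluate to NA.

```r
library(data.table)

# Made-up example with a leading and a trailing NA
DT2 <- data.table(a = c(NA, 0, NA, 1, NA))

# Linear interpolation over the row index; rows 1 and 5 lie outside the
# range of known points, so they stay NA
DT2[, b := approx(which(!is.na(a)), a[!is.na(a)], seq_len(.N))$y]

# Only the NA that is strictly between the 0 and the 1 is reported
hits <- DT2[, which(is.na(a) & b > 0 & b < 1)]
hits
```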

The zoo package and its na.locf() function can help you, as described by Dirk Eddelbuettel here:

The idea is that each "1" in the diff column tells you that, after the NAs that follow, the value in "a" changes.
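
The code that builds the diff column is not shown in this answer. The following is one possible reconstruction, not necessarily the author's: it assumes diff is the absolute difference between each non-NA value and the next non-NA value, with 0 for the last one, which reproduces the diff column printed in the table further down.

```r
library(data.table)

DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))

# Assumed construction: on the non-NA rows only, store the absolute change
# to the next non-NA value; the last non-NA row gets 0. Rows where a is NA
# are not touched and therefore end up as NA in the new column.
DT[!is.na(a), diff := c(abs(base::diff(a)), 0)]
DT
```

Calling `base::diff` explicitly is only a precaution so that the function is never confused with the new column of the same name.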

Now you want to fill in the NAs in the "diff" column. For clarity, we put the result in a new column, "b". This is where the zoo package comes in:

library(zoo)
DT[, b:=na.locf(diff)]
This results in:

     a diff b
 1:  0    0 0
 2: NA   NA 0
 3: NA   NA 0
 4:  0    1 1
 5: NA   NA 1
 6:  1    0 0
 7:  1    1 1
 8: NA   NA 1
 9:  0    1 1
10: NA   NA 1
11:  1    1 1
12: NA   NA 1
13: NA   NA 1
14: NA   NA 1
15:  0    1 1
16:  1    0 0
17:  1    1 1
18:  0    0 0
19: NA   NA 0
20:  0    0 0
Finally,

DT[is.na(a) & b == 1, which = TRUE]
gives you:

[1]  5  8 10 12 13 14
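
For readers who prefer not to depend on zoo, the same steps can be sketched end to end with `data.table::nafill(type = "locf")` swapped in for `na.locf()`. This is a substitute technique, not the answer's own code; it assumes a data.table version that provides `nafill()`, and the construction of diff is again an assumed reconstruction (absolute change to the next non-NA value, 0 for the last).

```r
library(data.table)

DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))

# Assumed reconstruction of the diff column (see the hedge above)
DT[!is.na(a), diff := c(abs(base::diff(a)), 0)]

# Carry the last observed diff forward over the NA rows
DT[, b := nafill(diff, type = "locf")]

# NA rows that sit in a stretch where the value changes afterwards
rows <- DT[is.na(a) & b == 1, which = TRUE]
rows
```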

Thank you. I think you could make it a little simpler by omitting is.na(DT[,a])&, leaving only which(DT[,b]>0&DT[,b]