R: find rows where NA lies between 0 and 1
I want to identify the rows that contain NA and lie between a 0 and a 1. Consider this data.table:
DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))
# DT
# a
# 1: 0
# 2: NA
# 3: NA
# 4: 0
# 5: NA
# 6: 1
# 7: 1
# 8: NA
# 9: 0
# 10: NA
# 11: 1
# 12: NA
# 13: NA
# 14: NA
# 15: 0
# 16: 1
# 17: 1
# 18: 0
# 19: NA
# 20: 0
You can compute the start of each NA run and then keep the runs whose neighbours are a 0 and a 1:
library("data.table")
DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))
r <- DT[, rle(is.na(a))]
R <- data.table(r$values, r$lengths, start=c(1, 1+head(cumsum(r$lengths), -1)))
i <- R[(V1), start]
j <- R[(V1), start+V2-1]
i[(DT[i-1, a] + DT[j+1, a])==1]
# result: [1] 5 8 10 12
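Note that the result above lists only the first row of each qualifying NA run (12, but not 13 and 14). If every row of a run is wanted, the same rle() idea could be extended roughly as follows. This is a sketch in base R on a plain vector, assuming a holds only 0, 1 and NA; the names run_start, run_end, na_start, na_end, before, after, keep and rows are introduced here for illustration and are not part of the original answer.

```r
a <- c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0)

# Run-length encode the NA pattern and derive first/last row of every run
r <- rle(is.na(a))
run_end   <- cumsum(r$lengths)
run_start <- run_end - r$lengths + 1L

# Keep only the runs that consist of NA
na_start <- run_start[r$values]
na_end   <- run_end[r$values]

# Value just before and just after each NA run. Padding with NA means a run
# touching either end of the vector gets NA as its neighbour instead of
# silently dropping an element.
before <- c(NA, a)[na_start]
after  <- c(a, NA)[na_end + 1L]

# With only 0/1 values, a run lies between a 0 and a 1 when the two
# neighbours sum to 1 (%in% turns NA sums into FALSE)
keep <- (before + after) %in% 1

# Expand each qualifying run into all of its row numbers
rows <- unlist(Map(seq, na_start[keep], na_end[keep]))
rows
# [1]  5  8 10 12 13 14
```

Unlike indexing with i-1 and j+1 directly, the padded lookups also behave when the vector starts or ends with NA.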
You could try using approx:
DT[,b := approx((1:.N)[!is.na(a)],na.omit(a),1:.N)$y]
Then run
DT[, which(is.na(a) & b>0 & b<1)]
which gives
[1] 5 8 10 12 13 14
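To see why this works, it may help to look at the interpolated values themselves. The following is a small base R sketch on a plain vector (the names idx and known are introduced here for illustration): an NA run between a 0 and a 1 is filled with values strictly between 0 and 1, while a run between two equal values is filled with exactly that value.

```r
a <- c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0)

idx   <- seq_along(a)
known <- !is.na(a)

# Linear interpolation at every position, using only the non-NA points
b <- approx(x = idx[known], y = a[known], xout = idx)$y

b[12:14]  # 0.75 0.50 0.25: the run between the 1 in row 11 and the 0 in row 15
b[2:3]    # 0 0: the run between two 0s

# Rows that are NA in a and interpolate to something strictly inside (0, 1)
which(is.na(a) & b > 0 & b < 1)
# [1]  5  8 10 12 13 14
```

As long as a holds only 0, 1 and NA, the non-NA rows interpolate to exactly 0 or 1, so which(b > 0 & b < 1) selects the same rows.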
The zoo package and its na.locf() function can help you, as described by Dirk Eddelbuettel here:
The idea here is that each "1" in the diff column tells you that the value in "a" changes after the NAs that follow.
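The step that creates the diff column is not shown in this answer. One way to obtain a column matching the table further down, offered as an assumption rather than as the original author's code, is to take the absolute difference between consecutive non-NA values of a and give the last non-NA row a 0:

```r
library(data.table)

DT <- data.table(a = c(0, NA, NA, 0, NA, 1, 1, NA, 0, NA, 1, NA, NA, NA, 0, 1, 1, 0, NA, 0))

# On the non-NA rows only: |difference to the next non-NA value|, with 0 for
# the last non-NA row. Rows where a is NA are not assigned and stay NA.
# base::diff is spelled out to make clear the function, not the column, is meant.
DT[!is.na(a), diff := c(abs(base::diff(a)), 0)]

DT$diff
```

A 1 in diff then marks a non-NA row whose next non-NA value differs from it, which is the property the following steps rely on.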
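Now you want to get rid of the NAs in the "diff" column by carrying the last value forward. For clarity, we put the result into a new column "b". This is where the zoo package comes into play: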
DT[, b:=na.locf(diff)]
This results in
a diff b
1: 0 0 0
2: NA NA 0
3: NA NA 0
4: 0 1 1
5: NA NA 1
6: 1 0 0
7: 1 1 1
8: NA NA 1
9: 0 1 1
10: NA NA 1
11: 1 1 1
12: NA NA 1
13: NA NA 1
14: NA NA 1
15: 0 1 1
16: 1 0 0
17: 1 1 1
18: 0 0 0
19: NA NA 0
20: 0 0 0
Finally,
DT[is.na(a) & b == 1, which = TRUE]
gives you:
[1] 5 8 10 12 13 14
Thank you. I think you could make it a little simpler by omitting is.na(DT[,a])&, so that it is only (DT[,b]>0&DT[,b]