Replace outliers with the group's second-smallest value in R


I'm new to R, and I have a data table dt:

> library(data.table)
> dt <- data.table(A = c(1,2,3,4,74,6, 7, 8, 9, 75, 11, 12), 
+                  B=c("P","P","P","P", "P", "P" ,"Q","Q","Q", "Q", "Q", "Q"), 
+                  C=c("a","b","c","d","e","f", "g", "h", "i", "j", "k", "l"))
> dt
     A B C
 1:  1 P a
 2:  2 P b
 3:  3 P c
 4:  4 P d
 5: 74 P e
 6:  6 P f
 7:  7 Q g
 8:  8 Q h
 9:  9 Q i
10: 75 Q j
11: 11 Q k
12: 12 Q l

We can use replace, applied per group, with out == 1 marking the rows to overwrite:

dt[, A := replace(A, out ==1, sort(A)[2]) , by = B]
dt
#     A B C out
# 1:  1 P a   0
# 2:  2 P b   0
# 3:  3 P c   0
# 4:  4 P d   0
# 5:  2 P e   1
# 6:  6 P f   0
# 7:  7 Q g   0
# 8:  8 Q h   0
# 9:  9 Q i   0
#10:  8 Q j   1
#11: 11 Q k   0
#12: 12 Q l   0
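The call above relies on an `out` column that does not appear in the question's table as shown. As a minimal, self-contained sketch, assuming `out` is a 0/1 flag with the values visible in the printed result (rows 5 and 10 flagged), the whole example can be reproduced like this:

```r
library(data.table)

# Rebuild the example table. The `out` column is an assumption taken from the
# printed output: 1 marks an outlier row, 0 marks a normal row.
dt <- data.table(
  A   = c(1, 2, 3, 4, 74, 6, 7, 8, 9, 75, 11, 12),
  B   = rep(c("P", "Q"), each = 6),
  C   = letters[1:12],
  out = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0)
)

# Within each group of B, overwrite flagged rows with the group's
# second-smallest value of A; unflagged rows are left unchanged.
dt[, A := replace(A, out == 1, sort(A)[2]), by = B]

print(dt)
```

Note that `sort(A)[2]` is computed over the whole group, outlier included. That is harmless here because the outliers are the largest values, but it would matter if a flagged value were among the two smallest in its group.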

Or, as another option:

dt[, A := pmax((out==1)*sort(A)[2], (out==0)*A), B]
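One caveat with the `pmax` version: it multiplies by 0/1 masks, so the masked-out entries become 0 and `pmax` then picks the larger of 0 and the real value. That gives the intended result only when the values in `A` are non-negative, as they are in the example. The sketch below uses a small hypothetical table (`dt2`, not from the question) with negative values to show the difference, and uses `data.table::fifelse` as one possible alternative that selects values directly:

```r
library(data.table)

# Hypothetical data with negative values; row 4 is flagged as an outlier.
dt2 <- data.table(
  A   = c(-5, -4, -3, 90, -1, -2),
  B   = "P",
  out = c(0, 0, 0, 1, 0, 0)
)

# Mask-and-pmax approach: every masked-out entry is 0, and 0 is larger than
# any negative value, so the original negatives are lost.
masked <- dt2[, pmax((out == 1) * sort(A)[2], (out == 0) * A), by = B]$V1
print(masked)

# fifelse chooses between the replacement and the original value per row,
# so the sign of the data does not matter.
dt2[, A := fifelse(out == 1, sort(A)[2], A), by = B]
print(dt2)
```

For data like the question's, where all values are positive, both forms give the same answer.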

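?sort also appears to document a partial sorting option (the partial argument), which may be worth using in this case since only the second-smallest value is needed.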
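A minimal sketch of that idea, assuming the same 0/1 `out` column as in the printed output and at least two rows in every group: `sort(A, partial = 2)` only guarantees that the element at position 2 is the one a full sort would put there, which is all this task needs.

```r
library(data.table)

# Same example table; `out` is assumed to flag outliers as in the output above.
dt <- data.table(
  A   = c(1, 2, 3, 4, 74, 6, 7, 8, 9, 75, 11, 12),
  B   = rep(c("P", "Q"), each = 6),
  C   = letters[1:12],
  out = c(0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0)
)

# Partial sort: position 2 holds the second-smallest value of the group,
# without fully ordering the remaining elements.
dt[, A := replace(A, out == 1, sort(A, partial = 2)[2]), by = B]

print(dt)
```

On groups this small the difference is negligible; any benefit would only show up with large groups, and it is worth benchmarking before relying on it.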