R: randomly delete at most 3 elements per row
I want to randomly delete up to three elements from each row of a data set with five columns. Below is the R code that I thought would do this, but it allows up to all five elements in a row to be deleted. This seems basic, but I cannot find the error. Thanks for any advice.
set.seed(1234)
# create matrix to contain flags identifying elements to be deleted
delete.these <- matrix(0, nrow=10, ncol=5)
for(i in 1:nrow(delete.these)) {
  # for each row randomly select the order of the columns
  # to be tested for deletion
  rcols <- sample(5, 5, replace = FALSE)
  for(j in 1:ncol(delete.these)) {
    # select a random draw
    delete.it <- runif(1,0,1)
    # if random draw is below specified threshold and fewer than three
    # elements have already been deleted from the row then delete element
    if((delete.it <= 0.7) & sum(delete.these[i,1:5] <= 2)) { delete.these[i,rcols[j]] = 1}
    if((delete.it > 0.7) | sum(delete.these[i,1:5] >= 3)) { delete.these[i,rcols[j]] = 0}
  }
}
delete.these
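For reference, here is a minimal sketch of the same loop with the two changes raised in the replies further down: the closing parenthesis of sum() is moved, and the second if is dropped. In the code above, `delete.these[i,1:5] <= 2` is evaluated element by element, so all five flags compare as TRUE, sum() returns 5, and `&` treats that as TRUE no matter how many elements are already flagged. The names `max.deletions` and `threshold` are introduced here only for illustration; the second if is left out on the assumption that it is not needed, since the matrix starts out filled with 0.

```r
set.seed(1234)
max.deletions <- 3    # illustrative name: cap on deletions per row, from the question
threshold     <- 0.7  # illustrative name: same cutoff as in the original loop

# matrix of flags identifying the elements to be deleted
delete.these <- matrix(0, nrow = 10, ncol = 5)

for (i in seq_len(nrow(delete.these))) {
  # visit the columns of this row in random order
  rcols <- sample(ncol(delete.these))
  for (j in rcols) {
    # flag the element only if the draw is at or below the threshold AND
    # fewer than max.deletions elements are already flagged in this row
    if (runif(1) <= threshold && sum(delete.these[i, ]) < max.deletions) {
      delete.these[i, j] <- 1
    }
  }
}

delete.these
rowSums(delete.these)  # number of flagged elements per row
```

Because a flag is only ever set while the row sum is below the cap, no row can end up with more than three flagged elements.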
Instead of using runif(), try drawing the indices directly:
delete.these <- matrix(0, nrow=10, ncol=5)
for (i in 1:NROW(delete.these)){
  delete.these[i,sample.int(5,sample.int(4,1)-1)] <- 1
}
delete.these
      [,1] [,2] [,3] [,4] [,5]
 [1,]    1    1    1    0    0
 [2,]    0    0    0    0    0
 [3,]    0    1    0    1    1
 [4,]    0    1    1    0    1
 [5,]    1    0    1    0    0
 [6,]    0    0    0    0    0
 [7,]    1    0    1    0    0
 [8,]    0    1    0    1    1
 [9,]    0    1    1    0    0
[10,]    1    0    1    0    1
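The flag matrix only marks positions, so it still has to be applied to the data. A minimal sketch of that step, assuming a hypothetical 10 x 5 data frame called `dat` (the question does not show the real data) and using NA to stand for a deleted element:

```r
set.seed(1234)
# hypothetical stand-in for the real data set: 10 rows, 5 columns
dat <- as.data.frame(matrix(rnorm(50), nrow = 10, ncol = 5))

# flag matrix built as in the answer: 0 to 3 columns drawn per row
delete.these <- matrix(0, nrow = nrow(dat), ncol = ncol(dat))
for (i in seq_len(nrow(delete.these))) {
  n.del <- sample.int(4, 1) - 1   # 0, 1, 2 or 3
  delete.these[i, sample.int(ncol(dat), n.del)] <- 1
}

# a logical matrix with the same shape as the data frame works as an index
dat[delete.these == 1] <- NA

rowSums(is.na(dat))  # number of deleted elements per row
```

When `n.del` is 0 the index is empty and the assignment leaves that row untouched, which is what allows rows with no deletions.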
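Thanks, but I don't want three elements deleted from every row. I want at most three elements deleted per row.
Oops. OK, I've updated the code to select at most 3. There is still no need for runif().

The closing parenthesis of sum() is misplaced in the if conditions. The original code has
sum(delete.these[i,1:5] <= 2)
where it should have
sum(delete.these[i,1:5]) <= 2
Thanks! It's been a long day. I also needed to remove the second if statement from the original post. It overwrites the output of the first if statement.

Use a two-column matrix as the index argument to [. The first version below sets exactly three elements in each row to NA. The second version addresses the comment "In your example, every row has three deleted elements. I want at most three elements deleted per row. A row can have 0, 1, 2, or 3 elements deleted." and draws the number of deletions for each row at random.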
dfrm <- data.frame(a1=rnorm(20), a2=rnorm(20),a3=rnorm(20),
a4=rnorm(20),a5=rnorm(20))
dfrm[ matrix( c( rep(1:20,each=3),
replicate(20, {sample(5, 3)} ) ), ncol=2) ] <- NA
> dfrm
a1 a2 a3 a4 a5
1 NA 0.70871541 NA NA -0.6922827
2 1.9846227 1.70592512 NA NA NA
3 0.2684487 NA 0.0008968694 NA NA
4 NA NA 0.5546355410 0.07399188 NA
5 NA 0.82324761 -0.0410918599 NA NA
6 NA NA -1.0715205164 NA -0.1683819
7 0.0933059 NA NA NA 1.3129301
8 NA 0.79382695 0.1877369725 NA NA
9 0.3124101 NA NA -1.22087347 NA
10 -0.1657043 NA NA 1.36626832 NA
11 NA -0.06095247 -0.9622792102 NA NA
12 NA -1.29243386 -1.2133819819 NA NA
13 -0.0886702 NA NA 0.37495775 NA
14 1.0812527 -1.54215156 NA NA NA
15 NA -0.24765627 NA 0.81374405 NA
16 NA 0.21307051 NA NA -0.6825013
17 -0.4129100 NA NA NA -0.9844177
18 NA 1.95881167 0.7977172969 NA NA
19 NA NA 0.0953287645 NA 1.7067591
20 NA NA -0.1057690912 0.73408897 NA
idx <- sapply(1:20, function(x) {n<- sample(1:5, sample(1:3,1))
matrix( c(rep(x,length(n)), n), ncol=2) }) # list
idx <- do.call(rbind, idx) # now a 2 col matrix
dfrm[ idx] <- NA
> dfrm
a1 a2 a3 a4 a5
1 -0.048776740 NA 1.1879195 -0.23142932 -3.6185891
2 NA 0.4613289 -0.4532400 -0.85891682 -2.2034714
3 NA NA 1.1191833 1.12545821 NA
4 0.646399767 -0.7126735 2.9474470 0.36358070 NA
5 -0.630929314 1.3770828 NA NA 1.3987857
6 NA NA NA 1.06680025 0.4445383
7 0.484728630 NA 0.7382064 NA 0.9838159
8 -1.558031074 1.1630888 NA NA NA
9 -0.968887379 -0.7330051 NA 0.04621124 -0.9785049
10 0.935436533 NA NA -1.07365274 NA
11 NA 0.2529093 NA -1.38643245 -1.3389529
12 NA -0.2639166 -0.2301257 NA NA
13 2.026646586 -0.2452684 NA -0.30346521 NA
14 0.522717033 NA NA 1.25870278 NA
15 NA NA -0.9934046 -0.89009964 -0.8403772
16 NA NA 0.0987765 -0.98608109 1.4646301
17 NA 0.7693064 -0.9326388 -0.16240266 NA
18 -0.005393965 NA NA NA -0.8111057
19 NA 1.6241122 -1.1376916 0.15812435 NA
20 NA NA NA 0.71059666 0.5170046
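Note that `sample(1:3,1)` draws 1, 2 or 3, so in the version above every row loses at least one element. If rows with no deletions should also be possible, as the question asks, one option is to draw the count from 0:3 instead. The following is a sketch under that assumption, not the answerer's code; it uses lapply() so the result is always a list, and `r` and `cols` are illustrative names.

```r
set.seed(1234)
dfrm <- data.frame(a1 = rnorm(20), a2 = rnorm(20), a3 = rnorm(20),
                   a4 = rnorm(20), a5 = rnorm(20))

# for each row, draw how many elements to delete (0 to 3), then which columns
idx <- lapply(seq_len(nrow(dfrm)), function(r) {
  cols <- sample.int(ncol(dfrm), sample(0:3, 1))
  # (row, column) pairs; a 0 x 2 matrix when nothing is drawn for this row
  matrix(c(rep(r, length(cols)), cols), ncol = 2)
})
idx <- do.call(rbind, idx)  # stack into one two-column index matrix

dfrm[idx] <- NA
rowSums(is.na(dfrm))  # number of deleted elements per row
```

Rows that draw a count of 0 contribute an empty matrix, which rbind() simply skips, so they keep all five values.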