R: resampling from the bag with the maximum sample value
I have 4 bags, each holding 20 values, and I randomly draw 10 values from each of the 4 bags:
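# Fill each of the 4 bags with 20 random integers between 0 and 50
bag1 <- bag2 <- bag3 <- bag4 <- numeric(20)
for (i in 1:20) {
  bag1[i] <- sample(0:50, 1)
  bag2[i] <- sample(0:50, 1)
  bag3[i] <- sample(0:50, 1)
  bag4[i] <- sample(0:50, 1)
}
# Draw 10 positions and add up the value found there in every bag
bag1value <- bag2value <- bag3value <- bag4value <- 0
for (j in 1:10) {
  samp <- sample(1:20, 1)  # the same position is used for all four bags
  bag1value <- bag1value + bag1[samp]
  bag2value <- bag2value + bag2[samp]
  bag3value <- bag3value + bag3[samp]
  bag4value <- bag4value + bag4[samp]
}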
Now I want to draw another 10 values from the bag that had the largest value in the first sample. So I can do:
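secondsample <- 0
maxbag <- max(bag1value, bag2value, bag3value, bag4value)
if (maxbag == bag1value) {
  for (j1 in 1:10) {
    samp <- sample(1:20, 1)
    secondsample <- secondsample + bag1[samp]
  }
} else if (maxbag == bag2value) {
  for (j1 in 1:10) {
    samp <- sample(1:20, 1)
    secondsample <- secondsample + bag2[samp]
  }
}
# bag3 and bag4 would need the same branch again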
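But I am looking for a more elegant way to do this.

As posted, your code did not run, and the loop variables j and j1 are never used inside the two for loops that compute the bag values and secondsample.

In any case, a more elegant way to handle the data is to use a list or an array. The first loop can be replaced by the matrix "bags" below, whose columns 1 to 4 represent bags 1 to 4: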
bags <- sapply(1:4, function(x) sample(0:50, 20, replace = TRUE))
colnames(bags) <- paste0("bag", 1:4)
head(bags)
     bag1 bag2 bag3 bag4
[1,]    7    1   14   16
[2,]   50   23   49    7
[3,]   14   48   26   10
[4,]   42   11    8   10
[5,]   31   43   11    9
[6,]    5   20   27   19
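Thank you very much! But when I sample, how can I make sure I use the same "row" for every bag? For example, if I decide to sample row [3], the output for the new sample would be (14, 48, 26, 10).

You can use sample(bags[, "bag1"], 10) to draw 10 values from "bag1".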
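For the follow-up about using the same row for every bag, one possible approach is to draw the row indices once and use them to index the whole matrix. This is only a sketch and not part of the original answer: the names idx, same_rows and bag_values are hypothetical, the seed is there only to make the run repeatable, and sampling with replacement is assumed because the loop in the question can pick the same position twice.

```r
set.seed(1)  # assumption: fixed seed, only for reproducibility

# Rebuild a 20 x 4 matrix of bags, one column per bag
bags <- sapply(1:4, function(x) sample(0:50, 20, replace = TRUE))
colnames(bags) <- paste0("bag", 1:4)

# Draw 10 row indices once (with replacement, like the question's loop)
idx <- sample(nrow(bags), 10, replace = TRUE)

# Indexing by row keeps the same rows for every bag
same_rows <- bags[idx, ]

# Sum of the 10 sampled values per bag (bag1value ... bag4value)
bag_values <- colSums(same_rows)
bag_values
```

Each row of same_rows is one draw across all four bags, so bags[3, ] on the matrix printed earlier would give (14, 48, 26, 10).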
If there is a tie when choosing the bag for the second sample, this code may have a problem. To avoid it, I think we should write: secondsample
new <- sapply(colnames(bags), function(x)sample(bags[,x], 10, replace=F))
head(new)
     bag1 bag2 bag3 bag4
[1,]   14    1   49    2
[2,]   31   26   13   18
[3,]    1   48   14    9
[4,]   38   23   27    6
[5,]   24   23   26   10
[6,]   14   42    8   29
max.new <- sapply(1:4, function(x) max(new[,x]))
max.new
[1] 38 48 49 29
max.bag <- colnames(bags)[max.new==max(max.new)]
secondsample <- sample(bags[,max.bag], 10)
secondsample
[1] 8 13 27 14 31 13 49 29 38 5
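On the tie concern raised in the comments, here is a hedged sketch of one way to make sure exactly one bag is selected. It assumes, as in the question, that a bag's value is the sum of its 10 sampled values, so it uses colSums rather than the per-column max shown above, and it breaks ties at random; which.max would instead always return the first tied bag. The names first, bag_values, tied and max_bag are hypothetical.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility

bags <- sapply(1:4, function(x) sample(0:50, 20, replace = TRUE))
colnames(bags) <- paste0("bag", 1:4)

# First sample: 10 values per bag, then one total per bag
first <- sapply(colnames(bags), function(x) sample(bags[, x], 10))
bag_values <- colSums(first)

# All bags that share the largest total (more than one if there is a tie)
tied <- names(bag_values)[bag_values == max(bag_values)]

# Pick exactly one of them; indexing with sample.int avoids the
# special case of sample() on a length-one vector
max_bag <- tied[sample.int(length(tied), 1)]

# Second sample: 10 values drawn from the selected bag only
secondsample <- sample(bags[, max_bag], 10)
secondsample
```

Because max_bag always has length one, bags[, max_bag] stays a plain vector of 20 values, even when two bags have the same total.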