How to find the remaining items in a vector after sampling in R?


I created a vector as follows:

Expenditure
 [1] 13.9 15.4 15.8 17.9 18.3 19.9 20.6 21.4 21.7 23.1
[11] 20.0 20.6 24.0 25.1 26.2 30.0 30.6 30.9 33.8 44.1

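I then drew a random sample of 10 items from it.

Now that I have created the sample, named randomSample, I want to find the remaining items in Expenditure. Is there an existing function I can use?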

This should work:

#generate 20 random numbers
x <- rnorm(20)
#sample 10 of them
randomSample <- sample(x, 10, replace = FALSE)

#we can get the ones we sampled with:
x[x %in% randomSample]

#Let's confirm this. NOTE - added sort() to easily see they do match
cbind(sort(randomSample), sort(x[x %in% randomSample]))

#So we want to negate the above
x[!(x %in% randomSample)]
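
If the sampling step is still under your control, one alternative is to sample positions rather than values, so the leftover items are simply the positions that were not drawn. This is only a sketch: the names `idx`, `drawn` and `remaining` are illustrative and not from the question. The data are the `Expenditure` values shown in the question, where 20.6 appears twice (7th and 12th entries), so matching by value would be ambiguous there.

```r
# Expenditure values copied from the question
Expenditure <- c(13.9, 15.4, 15.8, 17.9, 18.3, 19.9, 20.6, 21.4, 21.7, 23.1,
                 20.0, 20.6, 24.0, 25.1, 26.2, 30.0, 30.6, 30.9, 33.8, 44.1)

# Sample 10 positions (not values) without replacement
idx <- sample(seq_along(Expenditure), 10, replace = FALSE)

drawn     <- Expenditure[idx]    # the sampled items
remaining <- Expenditure[-idx]   # everything that was not sampled

length(remaining)  # 10
```

Because the split is made by position, duplicated values cause no trouble: each element ends up in exactly one of `drawn` or `remaining`.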

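How to do this depends on how you need to handle duplicates in the sampled vector. If you can be sure there are no duplicates, the simple approach given by @Chase, x[!(x %in% randomSample)], is perfect. If duplicates are possible, however, more care is needed. The following example shows why: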

# Start with a vector (length=9) replete with replicates
x <- rep(letters[1:3],3)

# Now sample 8 of its 9 values (leaving one unsampled)
set.seed(123)
randomSample <- sample(x, 8, replace = FALSE)

# try using simple method to find which value remains after sampling
x[!(x %in% randomSample)]
## character(0)
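
To see why the result is empty, it may help to look at what `%in%` actually tests. The sketch below reuses the same setup. Leaving out one of nine elements still leaves at least two copies of every letter in the sample, so every element of `x` matches something in the sample and all of them get dropped.

```r
x <- rep(letters[1:3], 3)
set.seed(123)
randomSample <- sample(x, 8, replace = FALSE)

# Every distinct value of x occurs somewhere in the sample ...
all(unique(x) %in% randomSample)   # TRUE

# ... so %in% flags all nine elements of x as "sampled"
sum(x %in% randomSample)           # 9
```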

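Take a look at the help for %in%. You could also do the sampling by constructing a logical index vector.

Note that the %in% approach may not give the desired behaviour if the data contain duplicates. It works for an example with no duplicates, but care is needed when using it on data with repeated values: it removes every value in x that matches a value in the sample, which may or may not be what you want.

An attempt at addressing this: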

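# Tabulate how often each value occurs in x and in the sample
xtab <- as.data.frame(table(x))
stab <- as.data.frame(table(randomSample))
# Subtract the sampled counts from the original counts, value by value
xtab[which(xtab$x %in% stab$randomSample),]$Freq <-
  xtab[which(xtab$x %in% stab$randomSample),]$Freq - stab$Freq
# Expand the leftover counts back into a vector of remaining values
rep(xtab$x, xtab$Freq)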
## [1] a
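
The same idea, removing each sampled value from `x` exactly once, can also be sketched as a small helper. `remaining_after_sample` is a hypothetical name, not an existing R function, and the loop favours clarity over speed. It tells you which values are left over. It does not tell you which original positions were sampled, because duplicated values cannot be told apart.

```r
# Hypothetical helper: drop one copy of each sampled value from x
remaining_after_sample <- function(x, smp) {
  keep <- rep(TRUE, length(x))
  for (v in smp) {
    # First position in x that still holds this value
    pos <- which(keep & x == v)[1]
    if (!is.na(pos)) keep[pos] <- FALSE
  }
  x[keep]
}

x <- rep(letters[1:3], 3)
set.seed(123)
randomSample <- sample(x, 8, replace = FALSE)

left <- remaining_after_sample(x, randomSample)
length(left)  # 1
```

Unlike the table-based version, this returns a vector of the same type as `x` rather than a factor.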