R: subsetting data with replacement

I am trying to draw a subset from my data with replacement. Here is a simple example:

dat <- data.frame (
  group = c(1,1,2,2,2,3,3,4,4,4,4,5,5), 
  var = c(0.1,0.0,0.3,0.4,0.8,0.5,0.2,0.3,0.7,0.9,0.2,0.4,0.6)
) 

Choose the number of groups to draw, n:

n <- 5 

Here is a fairly simple solution that uses make.unique() to create the group names in newdat:

## Your data
dat <- data.frame (
  group = c(1,1,2,2,2,3,3,4,4,4,4,5,5), 
  var = c(0.1,0.0,0.3,0.4,0.8,0.5,0.2,0.3,0.7,0.9,0.2,0.4,0.6)
) 
## Here n holds the group ids already drawn with replacement, not a count
n <- c(3,5,3,1,3,2,5,3,2)

## Make a 'look-up' data frame that associates sampled groups with new names,
## then use merge to create `newdat`
df <- data.frame(group = n, 
                 newgroup = as.numeric(make.unique(as.character(n))))
newdat <- merge(df, dat)[-1]
names(newdat)[1] <- "group"
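
A follow-up comment asks for about 20 groups in total, drawn at random. The sketch below is one way to combine the sampling step with the make.unique() look-up shown above; it is not part of the original answer. The names k, drawn and lookup, and the fixed seed, are assumptions made for illustration. It keeps the new labels as character strings, because converting them with as.numeric() stops being unique once a group is drawn more than ten times ("3.10" and "3.1" both become 3.1).

```r
dat <- data.frame(
  group = c(1,1,2,2,2,3,3,4,4,4,4,5,5),
  var   = c(0.1,0.0,0.3,0.4,0.8,0.5,0.2,0.3,0.7,0.9,0.2,0.4,0.6)
)

set.seed(1)  # assumption: fixed seed, only so the draw is reproducible
k <- 20      # assumption: total number of groups to draw

# Draw k group ids with replacement from the distinct groups
drawn <- sample(unique(dat$group), k, replace = TRUE)

# Look-up table: one row per draw, with a unique character label per draw
lookup <- data.frame(
  group    = drawn,
  newgroup = make.unique(as.character(drawn)),
  stringsAsFactors = FALSE
)

# Many-to-many merge: every draw picks up all rows of its original group
newdat <- merge(lookup, dat, by = "group")

# Why the labels stay character: numeric conversion collides after ten repeats
collides <- anyDuplicated(as.numeric(make.unique(rep("3", 11)))) > 0
```

Note that merge() sorts the result by group, so the rows of newdat are not in the order in which the groups were drawn.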

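I think you mean sampling with replacement? How many samples should each original group have?

Sorry, I did not state the total sample size (number of groups) in the new data; say 20. Each original group can be selected any number of times, at random.

Hi, you suggest a text-based approach to compute the "pretty" version. Because I am working with a huge data set, that is not convenient. Do you have any other suggestions or ideas for solving this automatically? Thanks.

make.unique seems to do this well. See @JoshOBrien's answer.

Yes. It is very simple and has no problems with the new group numbers. Much better than yours.

@gsk3 -- No problem. Both it and make.names have come in handy for me.
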
lvls <- unique(dat$group)
gp.orig <- gp.samp <- sample( lvls, n, replace=TRUE ) #this is the actual sampling
library(taRifx)
res <- stack.list(lapply( gp.samp, function(i) dat[dat$group==i,] ))
# Now make your pretty group names
while(any(duplicated(gp.samp))) {
  gp.samp[duplicated(gp.samp)] <- gp.samp[duplicated(gp.samp)] + .1
}
# Replace group with pretty group names (a simple merge doesn't work here because the groups are not unique)
gp.df <- as.data.frame(table(dat$group))
names(gp.df) <- c("group","n")
gp.samp.df <- merge(data.frame(group=gp.orig,pretty=gp.samp,order=seq(length(gp.orig))), gp.df )
gp.samp.df <- sort(gp.samp.df, f=~order)
res$pretty <- with( gp.samp.df, rep(pretty,n))

   group var pretty
6      3 0.5    3.0
7      3 0.2    3.0
12     5 0.4    5.0
13     5 0.6    5.0
61     3 0.5    3.1
71     3 0.2    3.1
62     3 0.5    3.2
72     3 0.2    3.2
3      2 0.3    2.0
4      2 0.4    2.0
5      2 0.8    2.0
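
If installing taRifx is not an option, the same stacking can probably be done in base R. The following is a minimal sketch, not the answerer's code: it assumes do.call(rbind, ...) is an acceptable stand-in for stack.list, and it attaches the label inside the loop so that no later merge or sort is needed. The names labels and pieces, and the fixed seed, are illustrative.

```r
dat <- data.frame(
  group = c(1,1,2,2,2,3,3,4,4,4,4,5,5),
  var   = c(0.1,0.0,0.3,0.4,0.8,0.5,0.2,0.3,0.7,0.9,0.2,0.4,0.6)
)

set.seed(42)  # assumption: fixed seed, only so the draw is reproducible
n <- 5

# The actual sampling: n group ids drawn with replacement
gp.samp <- sample(unique(dat$group), n, replace = TRUE)

# One unique character label per draw, e.g. "3", "3.1", "3.2"
labels <- make.unique(as.character(gp.samp))

# For each draw, copy the rows of that group and tag them with the draw's label
pieces <- lapply(seq_along(gp.samp), function(i) {
  block <- dat[dat$group == gp.samp[i], , drop = FALSE]
  block$pretty <- labels[i]
  block
})

# Stack the blocks in draw order and reset the row names
res <- do.call(rbind, pieces)
rownames(res) <- NULL
```

Unlike the merge-based approach, this keeps the blocks in the order in which the groups were drawn. For very large data, repeated rbind calls may be slow, so treat this as a starting point only.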