Random sample of a data frame in R

I have the following data frame:

id<-c(1,1,2,3,3)
date<-c("23-01-08","01-11-07","30-11-07","17-12-07","12-12-08")
df<-data.frame(id,date)
df$date2<-as.Date(as.character(df$date), format = "%d-%m-%y")

id     date      date2
1   23-01-08 2008-01-23
1   01-11-07 2007-11-01
2   30-11-07 2007-11-30
3   17-12-07 2007-12-17
3   12-12-08 2008-12-12

Any help would be greatly appreciated.

First, draw the sample of ids:

s_ids=sample(unique(df$id),2)
Then select the matching records from df:

new_df=df[df$id %in% s_ids,]
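The two steps above can be wrapped into one small function. This is only a sketch: `sample_by_id` is a hypothetical helper name, not something from the answer, and `set.seed(1)` is there purely to make the draw reproducible.

```r
# Rebuild the example data from the question.
df <- data.frame(
  id   = c(1, 1, 2, 3, 3),
  date = c("23-01-08", "01-11-07", "30-11-07", "17-12-07", "12-12-08")
)
df$date2 <- as.Date(as.character(df$date), format = "%d-%m-%y")

# Hypothetical helper: draw n distinct ids, then keep every row for those ids.
sample_by_id <- function(data, n) {
  ids <- unique(data$id)
  # Index into ids instead of calling sample(ids, n): sample() treats a single
  # number x as 1:x, which would give surprising results if only one id existed.
  picked <- ids[sample.int(length(ids), n)]
  data[data$id %in% picked, , drop = FALSE]
}

set.seed(1)  # assumption: any seed works, it only fixes the random draw
new_df <- sample_by_id(df, 2)
new_df
```

Because the filter uses `%in%`, every row belonging to a sampled id is returned, so the result can have more than two rows.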

You can use the sample function:

set.seed(2)
df[match(sample(unique(df$id),2),df$id),]
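The sample() function generates a random set of ids, which you can then match against the rows of the df data frame to retrieve the rest of the data.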
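One detail worth checking with this approach: `match()` returns only the position of the first occurrence of each value, whereas `%in%` flags every occurrence. The sketch below uses a fixed `picked` vector (an assumption, in place of a random draw) so the comparison is deterministic.

```r
id <- c(1, 1, 2, 3, 3)   # the id column from the question
picked <- c(1, 3)        # fixed here instead of sampled

# Position of the first row for each picked id.
first_only <- match(picked, id)

# Positions of all rows whose id was picked.
all_rows <- which(id %in% picked)

first_only  # 1 4
all_rows    # 1 2 4 5
```

So indexing with `match()` gives one row per sampled id, while indexing with `%in%` gives all of that id's rows.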
For more information, see the documentation for sample.

Or use dplyr:

chosen <- sample(unique(df$id), 2)
library(dplyr)
df %>% 
    filter(id %in% sample(unique(id),2))
#  id     date      date2
#1  2 30-11-07 2007-11-30
#2  3 17-12-07 2007-12-17
#3  3 12-12-08 2008-12-12

Using sqldf:

library(sqldf)
a <- sqldf("SELECT DISTINCT id FROM df  ORDER BY RANDOM(*) LIMIT 2")
sqldf("SELECT * FROM df WHERE id IN a")

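This will not work if you have duplicate id values. That is, with the current data you might select 1 twice.
This does not give the expected result: you always get 5 rows back.
Updated the answer.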
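One reading of the first comment (an assumption, since the code it refers to is not shown) is that sampling directly from the id column, rather than from `unique(id)`, can draw the same id twice. The sketch below enumerates every possible pair instead of sampling, so the result does not depend on a seed.

```r
id <- c(1, 1, 2, 3, 3)   # the id column from the question

# All pairs that can be drawn without replacement from the raw column.
pairs_raw <- combn(id, 2)
# Number of pairs in which both draws are the same id (1 and 1, or 3 and 3).
same_id_raw <- sum(pairs_raw[1, ] == pairs_raw[2, ])

# The same enumeration after removing duplicate ids first.
pairs_unique <- combn(unique(id), 2)
same_id_unique <- sum(pairs_unique[1, ] == pairs_unique[2, ])

same_id_raw     # 2
same_id_unique  # 0
```

Sampling from `unique(id)` therefore always yields two different ids for this data.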
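Another dplyr variant draws two distinct ids with sample_n() and keeps the matching rows with semi_join():

df %>%
     select(id) %>%
     unique() %>%
     sample_n(2) %>%
     semi_join(df, .)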
#  id     date      date2
#1  1 23-01-08 2008-01-23
#2  1 01-11-07 2007-11-01
#3  2 30-11-07 2007-11-30
The sqldf query above returns, for example:
  id     date      date2
1  1 23-01-08 2008-01-23
2  1 01-11-07 2007-11-01
3  3 17-12-07 2007-12-17
4  3 12-12-08 2008-12-12