Stratified dataframe in R: managing null values


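I want to draw a sample from a data frame, stratified on two columns. I am working with very small probabilities, and I run into a problem at the last step. Here is my approach: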

library(splitstackshape)

#Creation of a dataframe similar to the one I'm working on.
data1 <- data.frame(categorie_metier = sample(c("agriculteur", "artisan", "autre", "cadres", "employes", "ouvriers", "prof_int"), 429, replace = TRUE, prob = c(0.01, 0.05, 0.14, 0.41, 0.25, 0.04, 0.10)), en_teletravail = sample(c("0", "1"), 429, replace = TRUE, prob = c(0.59, 0.41)), stringsAsFactors = TRUE)
        
#Creation of a dataframe to simulate my probabilities.
data2 <- data.frame(categorie_metier = sample(c("agriculteur", "artisan", "autre", "cadres", "employes", "ouvriers", "prof_int"), 1000000, replace = TRUE, prob = c(0.01, 0.03, 0.27, 0.21, 0.13, 0.10, 0.25)), en_teletravail = sample(c("0", "1"), 1000000, replace = TRUE, prob = c(0.991, 0.009)), stringsAsFactors = TRUE)
        
#Grouping of columns.
data2$groupe <- paste(data2$categorie_metier, data2$en_teletravail)
        
#Extraction of groups in a variable. Objective: Create an output dataframe of 50 lines.
gsize <- 50 * round(prop.table(table(data2$groupe)), 2)
gsize = as.list(gsize)
        
#Generation of the output dataframe.
data3 <- stratified(data1, c("categorie_metier", "en_teletravail"), gsize)
Error in stratified(data1, c("categorie_metier", "en_teletravail"), gsize) : 
Incompatible sizes supplied
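Two things in the code above are worth checking, though neither is confirmed here as the cause of the error. First, 50 * round(p, 2) yields sizes that are multiples of 0.5 (for example 13.5) and exactly 0 for the rare teleworking groups. Second, a group that exists in data2 may be rare or absent in data1. The following is a minimal base-R sketch, not the splitstackshape method: it rounds after scaling to 50, drops empty targets, and caps each target at the rows actually available. The seed, the reduced size of data2, and the names metiers, target, available, n_take and rows are assumptions made for illustration.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

metiers <- c("agriculteur", "artisan", "autre", "cadres",
             "employes", "ouvriers", "prof_int")

# Stand-ins for data1 and data2 from the question (data2 is shrunk to 100000 rows).
data1 <- data.frame(
  categorie_metier = sample(metiers, 429, replace = TRUE,
                            prob = c(0.01, 0.05, 0.14, 0.41, 0.25, 0.04, 0.10)),
  en_teletravail = sample(c("0", "1"), 429, replace = TRUE, prob = c(0.59, 0.41))
)
data2 <- data.frame(
  categorie_metier = sample(metiers, 100000, replace = TRUE,
                            prob = c(0.01, 0.03, 0.27, 0.21, 0.13, 0.10, 0.25)),
  en_teletravail = sample(c("0", "1"), 100000, replace = TRUE, prob = c(0.991, 0.009))
)

# Same grouping key in both data frames.
data1$groupe <- paste(data1$categorie_metier, data1$en_teletravail)
data2$groupe <- paste(data2$categorie_metier, data2$en_teletravail)

# Whole-number targets: scale to 50 first, then round; drop groups that round to 0.
target <- round(50 * prop.table(table(data2$groupe)))
target <- target[target > 0]

# Rows available in data1 for each targeted group (0 if the group is absent).
available <- table(factor(data1$groupe, levels = names(target)))

# Never request more rows than a group contains, and skip groups left empty.
n_take <- pmin(as.integer(target), as.integer(available))
names(n_take) <- names(target)
n_take <- n_take[n_take > 0]

# Sample row indices without replacement inside each group.
rows <- unlist(lapply(names(n_take), function(g) {
  idx <- which(data1$groupe == g)
  idx[sample.int(length(idx), n_take[[g]])]
}))
data3 <- data1[rows, ]
```

Because each group is rounded independently and very rare groups are dropped, data3 can end up with slightly fewer or more than 50 rows; whether that is acceptable depends on how strictly the target size must be met.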