R: creating a data frame that lists all factor levels - is there a better way?


r dataframe


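I want to generate a data frame with the following columns:

The name of each factor from another data frame

Each level of each factor

The corresponding level number

I eventually managed to write the code below, which pretty much works, but it seems a bit convoluted (my R experience is fairly limited, and a lot of googling was involved). What is wrong with my code, and is there a better way to produce the same output in the same format?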

mydata <- iris

#Get vector of column types
type <- sapply(mydata,class)
# Filter out just the ones that are factors
factors = type[type=="factor"]
# Allocate a vector to hold 1 data frame per factor
listOfFactors <- vector(mode = "list", length = length(factors))

# For each factor, list all the levels of that factor, and the level number
for (j in 1:length(factors)) {
    cur_colname <- names(factors[j])
    cur_colnum <- which(colnames(mydata)==cur_colname)
    cur_nlevels <- nlevels(mydata[,cur_colnum])
    listOfFactors[[j]] <- data.frame(VarName=character(cur_nlevels),
                                     Level=character(cur_nlevels),
                                     Number=integer(cur_nlevels),
                                     stringsAsFactors=FALSE
                                     )
    for (i in 1:cur_nlevels) {
        cur_level <- levels(mydata[,cur_colnum])[i]
        listOfFactors[[j]]$VarName[i] <- cur_colname
        listOfFactors[[j]]$Level[i] <- cur_level
        listOfFactors[[j]]$Number[i] <- i
    }
}

allfactorlevels <- do.call("rbind", listOfFactors)

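The main problem with your code is that it does not use vectorized operations. This can be tricky when coming from other languages, but for loops are almost never the answer in R, especially when you use them to access the elements of a vector/list/data frame one at a time. I kept the first part of your code and then took a (more) R-like approach to building the output: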
type <- sapply(mydata,class)
factors = type[type=="factor"]
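One caveat with picking columns by class(), sketched here on a hypothetical data frame `df` that is not part of the original question: an ordered factor has two classes, c("ordered", "factor"), so sapply(df, class) no longer returns a plain character vector and a comparison with "factor" can miss that column. Testing each column with is.factor() should avoid the problem.

```r
# Hypothetical example data: one plain factor, one ordered factor, one numeric
df <- data.frame(
  colour = factor(c("red", "green", "red")),
  size   = factor(c("S", "L", "M"), levels = c("S", "M", "L"), ordered = TRUE),
  weight = c(1.2, 3.4, 2.2)
)

# class() returns two strings for the ordered factor,
# so sapply() cannot simplify the result and falls back to a list
type <- sapply(df, class)
is.list(type)

# is.factor() yields exactly one TRUE/FALSE per column
is_fac <- vapply(df, is.factor, logical(1))
factor_names <- names(df)[is_fac]
factor_names
```

With this data, `factor_names` contains both `colour` and `size`.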

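A quick way using dplyr functions: select the factor variables, create a data frame of factor levels and numbers for each variable, then bind these data frames together. purrr::map_dfr performs this iteration and adds an ID variable to the resulting data frame; in this case, it is the name of the original variable.
To illustrate and test this better, I added another factor column to the data.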
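set.seed(1)
library(dplyr)

# Add a second factor column so there is more than one factor to list
mydata <- iris %>%
  mutate(group = as.factor(sample(letters[1:4], nrow(.), replace = TRUE)))

mydata %>%
  select(where(is.factor)) %>%
  purrr::map_dfr(function(f) {
    data.frame(Level = levels(f),
               Number = seq_along(levels(f)))
  }, .id = "VarName")
#>   VarName      Level Number
#> 1 Species     setosa      1
#> 2 Species versicolor      2
#> 3 Species  virginica      3
#> 4   group          a      1
#> 5   group          b      2
#> 6   group          c      3
#> 7   group          d      4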
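If loading dplyr and purrr is not an option, roughly the same table can be built in base R without any explicit iteration over levels. The following is only a sketch under the assumption that plain `iris` is the input (so only `Species` appears); `lv` is a name introduced here for illustration.

```r
mydata <- iris

# Keep only the factor columns, then extract the levels of each one
lv <- lapply(Filter(is.factor, mydata), levels)

allfactorlevels <- data.frame(
  VarName = rep(names(lv), lengths(lv)),    # factor name, repeated once per level
  Level   = unlist(lv, use.names = FALSE),  # all levels, factor by factor
  Number  = unname(sequence(lengths(lv))),  # counter restarting at 1 for each factor
  stringsAsFactors = FALSE
)
allfactorlevels
```

The three columns are computed as whole vectors, so the result keeps the VarName / Level / Number layout asked for in the question.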
Take a look at expand.grid.
Is the object res somehow local to the function definition?
output <- lapply(names(factors),function(x){
  res <- data.frame(VarName=x, 
                    Level=levels(mydata[,x]), 
                    Number=1:nlevels(mydata[,x]))
  return(res)
})

do.call(rbind, output)
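On the question of whether res is local: a variable assigned with <- inside a function body lives in the environment of that call and is gone once the call returns. A small sketch, where `make_row` is a hypothetical name used only for illustration:

```r
# Hypothetical helper mirroring the anonymous function above
make_row <- function(x) {
  res <- data.frame(VarName = x, stringsAsFactors = FALSE)  # 'res' exists only inside this call
  res  # the last evaluated value is returned, so return() is optional
}

out <- make_row("Species")

# Check the calling environment only: 'res' did not leak out of the function
res_leaked <- exists("res", inherits = FALSE)
res_leaked
```

This is why the explicit `return(res)` and the intermediate name could be dropped without changing the result.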