R: split a vector of factor lists into a data frame with one column per factor level


The data I received looks like this:

tree_uses <- c("Food Fuel Land_benefits Medicines","Food","Food","Food Fuel","Food Fuel","Food")
I found that this works:

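split_factor_cols <- function(x) {
    # Split each observation into its space-separated factor levels
    temp1 <- strsplit(as.character(x), " ", fixed = TRUE)
    factor_names <- unique(unlist(temp1))
    zz <- length(factor_names)
    df <- data.frame(matrix(NA, nrow = length(x), ncol = zz))
    names(df) <- factor_names

    # Count exact occurrences of each level per observation
    # (charmatch() also accepts partial matches, so compare with == instead)
    for (i in seq_len(zz)) {
        df[, i] <- vapply(temp1, function(y) sum(y == factor_names[i], na.rm = TRUE), integer(1))
    }
    return(df)
}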
Please provide a reproducible example: dput(trees$tree_uses[1:6])

Provide reproducible example data. I guess you are looking for "convert a factor to binary columns".

Hopefully the data is now in the right format. Yes, I want binary columns, but my data has multiple factor levels for each observation, so the first step is to split the data into separate factor columns.

Using the tm package:

library(tm)

d <- VCorpus(VectorSource(tree_uses))
dtm <- DocumentTermMatrix(d)

# inspect(dtm)

as.matrix(dtm)
#     Terms
# Docs food fuel land_benefits medicines
#    1    1    1             1         1
#    2    1    0             0         0
#    3    1    0             0         0
#    4    1    1             0         0
#    5    1    1             0         0
#    6    1    0             0         0
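
For comparison, here is a minimal base-R sketch of the same indicator table that avoids the tm dependency and keeps the original capitalisation of the level names (DocumentTermMatrix lowercases terms by default, as the output above shows). It is not taken from the original answer; the names tokens, levels_found, indicator and result are chosen here for illustration, and it assumes levels are separated by single spaces.

```r
tree_uses <- c("Food Fuel Land_benefits Medicines", "Food", "Food",
               "Food Fuel", "Food Fuel", "Food")

# Split every observation into its individual levels
tokens <- strsplit(as.character(tree_uses), " ", fixed = TRUE)

# All distinct levels, in order of first appearance
levels_found <- unique(unlist(tokens))

# One row per observation, one 0/1 entry per level (exact match via %in%)
indicator <- do.call(rbind, lapply(tokens, function(y) as.integer(levels_found %in% y)))
colnames(indicator) <- levels_found

result <- as.data.frame(indicator)
print(result)
#   Food Fuel Land_benefits Medicines
# 1    1    1             1         1
# 2    1    0             0         0
# 3    1    0             0         0
# 4    1    1             0         0
# 5    1    1             0         0
# 6    1    0             0         0
```

Unlike the counting loop in the question, this records presence or absence only, so a level repeated within one observation still yields 1.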