From a vector of strings to a model matrix in R
I have a vector of 16163 strings, like this:
sentencevector <- c('decided clean debt get finances together Thank consideration',
'I stable job I will never get laid I fixed',
'Using pay existing loans credit card debt All higher',
'Substantially lower giving peace mind My job stable'...)
and I would like to obtain a data frame like this:
Data <- data.frame(
X = c('decided clean debt get finances together thank consideration'...),
decided = 1,
lean = 1,
dance = 0,
debt=1 ,...)
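Use DocumentTermMatrix or TermDocumentMatrix on sentencevector:
You have to treat each sentence as a document. Try passing the whole set of sentences to this function. Afterwards you can apply your own filter to extract the data you are looking for. For example, something like: if val > 0 then 1 else 0.
Here is a somewhat involved tutorial: take a look at the tm package. You want to create a term-document matrix.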
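As a minimal sketch of the tm approach described above, assuming the tm package is installed: each sentence becomes one document, the document-term matrix holds word counts, and the "if val > 0 then 1 else 0" filter turns the counts into indicators. The names corpus, counts and indicators are illustrative, and the control options are one reasonable choice rather than the only one.

```r
library(tm)  # assumption: the tm package is installed

sentencevector <- c(
  'decided clean debt get finances together Thank consideration',
  'I stable job I will never get laid I fixed',
  'Using pay existing loans credit card debt All higher',
  'Substantially lower giving peace mind My job stable'
)

# Treat each sentence as its own document.
corpus <- VCorpus(VectorSource(sentencevector))

# Lowercase the terms and keep words of any length; the default minimum
# word length of 3 would drop short words such as "I" and "My".
dtm <- DocumentTermMatrix(
  corpus,
  control = list(tolower = TRUE, wordLengths = c(1, Inf))
)

# One row per sentence, one column per term, cells hold counts.
counts <- as.matrix(dtm)

# The filter from the answer: if val > 0 then 1 else 0.
indicators <- ifelse(counts > 0, 1L, 0L)

# Put the original sentence next to its indicator columns.
Data <- data.frame(X = sentencevector, indicators,
                   check.names = FALSE, stringsAsFactors = FALSE)
```

With 16163 sentences the dense matrix from as.matrix can get large, so it may be worth keeping the sparse dtm object for as long as possible and converting only at the end.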
# universe is assumed to hold the vocabulary (the unique words across all sentences)
df <- setNames(data.frame(matrix(ncol = length(universe), nrow = length(sentencevector))), universe)
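The empty skeleton above still has to be filled in. A possible base R sketch, assuming universe is the sorted set of lowercased words and that splitting on non-letters is an acceptable tokenizer (both are assumptions, not something the question specifies):

```r
sentencevector <- c(
  'decided clean debt get finances together Thank consideration',
  'I stable job I will never get laid I fixed',
  'Using pay existing loans credit card debt All higher',
  'Substantially lower giving peace mind My job stable'
)

# Lowercase each sentence and split it into words on runs of non-letters.
tokens <- strsplit(tolower(sentencevector), "[^a-z]+")

# Vocabulary: every distinct non-empty word, in alphabetical order.
universe <- sort(unique(unlist(tokens)))
universe <- universe[nzchar(universe)]

# For each sentence, flag which vocabulary words it contains (1) or not (0).
# vapply returns one column per sentence, so transpose to one row per sentence.
indicators <- t(vapply(
  tokens,
  function(words) as.integer(universe %in% words),
  integer(length(universe))
))
colnames(indicators) <- universe

# Original sentence followed by one indicator column per word.
df <- data.frame(X = sentencevector, indicators,
                 check.names = FALSE, stringsAsFactors = FALSE)
```

This avoids any package dependency, at the cost of building a dense matrix; for a large vocabulary a sparse representation would likely be the better choice.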