R STM: How to preserve metadata when converting from tm to an STM document-term matrix?


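I am trying to run a structural topic model (using the stm package) on a document-term matrix prepared with the tm package.

I built a corpus in the tm package that contains the following metadata: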

library(tm)

myReader2 <- readTabular(mapping=list(content="text", id="id", sentiment = "sentiment"))
text_corpus2 <- VCorpus(DataframeSource(bin_stm_df), readerControl = list(reader = myReader2))

meta(text_corpus2[[1]])
  id       : 11
  sentiment: negative
  language : en
So far, so good. However, when I try to pull the metadata out of the stm-compatible data, it is gone:

docsTM <- DTM_st$documents # works fine
vocabTM <- DTM_st$vocab # works fine
metaTM <- DTM_st$meta # returns NULL

> metaTM
NULL

How about trying the quanteda package?

Without access to your objects I can't guarantee this works verbatim, but it should:

library("quanteda")

# create the corpus; every column except "text" becomes a document variable
text_corpus3 <- corpus(bin_stm_df, text_field = "text")

# tokenize, then convert to a document-feature matrix
# (cleaning options can be added to tokens(); see ?tokens)
chat_DTM3 <- dfm(tokens(text_corpus3))

# similar to tm::removeSparseTerms(): keep features that occur in at least
# 1% of documents (older quanteda releases wrote this as sparsity = 0.990)
DTM3 <- dfm_trim(chat_DTM3, min_docfreq = 0.01, docfreq_type = "prop")

# convert to STM format
DTM_st <- convert(DTM3, to = "stm")

# then it's all there
docsTM <- DTM_st$documents
vocabTM <- DTM_st$vocab
metaTM <- DTM_st$meta      # should return the data.frame of document variables
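
To check that the document variables really survive the conversion, here is a minimal sketch on a toy data frame. The data frame toy_df and all toy_* names are hypothetical stand-ins for bin_stm_df, which is not available here; the only assumption is that it has a "text" column plus metadata columns such as "id" and "sentiment".

```r
library("quanteda")

# Hypothetical stand-in for bin_stm_df: one text column plus metadata columns
toy_df <- data.frame(
  id = c(11, 12, 13, 14),
  text = c("the service was slow and the food was cold",
           "great food and friendly service",
           "cold food, slow staff, never again",
           "friendly staff and great coffee"),
  sentiment = c("negative", "positive", "negative", "positive"),
  stringsAsFactors = FALSE
)

# Columns other than "text" are stored as document variables (docvars)
toy_corpus <- corpus(toy_df, text_field = "text")

# Tokenize, build the document-feature matrix, convert to the stm layout
toy_dfm <- dfm(tokens(toy_corpus, remove_punct = TRUE))
toy_stm <- convert(toy_dfm, to = "stm")

# The result is a list with documents, vocab, and meta;
# meta should hold one row per document
str(toy_stm$meta)
stopifnot(nrow(toy_stm$meta) == length(toy_stm$documents))
```

If the check passes, toy_stm$meta can be handed to stm::stm() through its data argument together with a prevalence formula such as ~ sentiment, which is how stm consumes document-level covariates.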

Hi all, I figured it out in the end, but thanks for posting such a great answer here!