
Extracting n-grams with R


I am trying to use the ngramrr package to extract 3-grams (trigrams) from Nirvana lyrics:

require(ngramrr)
require(tm)
require(magrittr)

nirvana <- c("hello hello hello how low", "hello hello hello how low",
             "hello hello hello how low", "hello hello hello",
             "with the lights out", "it's less dangerous", "here we are now",
             "entertain us", "i feel stupid", "and contagious", "here we are now", 
             "entertain us", "a mulatto", "an albino", "a mosquito", "my libido",
             "yeah", "hey yay")

ngramrr(nirvana[1], ngmax = 3)

Corpus(VectorSource(nirvana))

How can I build a TermDocumentMatrix whose terms are the trigrams?

Thank you.

My comment above was nearly a complete answer already, but here it is in full:

nirvana %>% tokens(ngrams = 1:3) %>% # generate tokens
  dfm %>% # generate dfm
  convert(to = "tm") %>% # convert to tm's document-term-matrix
  t # transpose it to term-document-matrix
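
For readers on a recent quanteda release, the same pipeline can be sketched with the n-gram step split out. As far as I am aware, newer versions no longer accept an ngrams argument in tokens() and use tokens_ngrams() instead. This is only a sketch under that assumption: the shortened lyrics vector, the choice of n = 3 (trigrams only, rather than 1:3), and the space concatenator are illustrative choices, not part of the answer above.

```r
library(quanteda)  # tokens(), tokens_ngrams(), dfm(), convert()
library(tm)        # provides the TermDocumentMatrix class and its t() method

# Shortened example input (illustrative subset of the lyrics)
nirvana <- c("hello hello hello how low",
             "with the lights out",
             "here we are now",
             "entertain us")

toks  <- tokens(nirvana)                                   # split into words
tri   <- tokens_ngrams(toks, n = 3, concatenator = " ")    # trigrams only
dfmat <- dfm(tri)                                          # document-feature matrix
tdm   <- t(convert(dfmat, to = "tm"))                      # tm TermDocumentMatrix

featnames(dfmat)  # the trigram terms
```

Documents with fewer than three words, such as "entertain us", contribute no trigrams, so their column in the resulting matrix is empty.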

Comments:

I would use quanteda and convert to tm format: nirvana %>% tokens(ngrams = 1:3) %>% dfm %>% convert(to = "tm")

@amatsuo_net Thanks, could you give me an R example?

@Cath Thanks ;)