
Extracting ngrams with R


I am trying to use the ngramrr package to extract 3-grams from Nirvana lyrics.

require(ngramrr)
require(tm)
require(magrittr)

nirvana <- c("hello hello hello how low", "hello hello hello how low",
             "hello hello hello how low", "hello hello hello",
             "with the lights out", "it's less dangerous", "here we are now",
             "entertain us", "i feel stupid", "and contagious", "here we are now", 
             "entertain us", "a mulatto", "an albino", "a mosquito", "my libido",
             "yeah", "hey yay")

ngramrr(nirvana[1], ngmax = 3)

Corpus(VectorSource(nirvana))
I would like to know what I can do to build a TermDocumentMatrix whose terms are the trig (trigram) list.


Thank you.

My comment above is almost complete already, but here it is:

nirvana %>% tokens(ngrams = 1:3) %>% # generate tokens
  dfm %>% # generate dfm
  convert(to = "tm") %>% # convert to tm's document-term-matrix
  t # transpose it to term-document-matrix
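The pipeline above relies on the ngrams argument of tokens(), which recent quanteda releases have deprecated in favour of a separate tokens_ngrams() step. The following is a minimal sketch under that assumption, not the answerer's own code: it uses a shortened copy of the nirvana vector and sets n = 3 so that only trigrams are kept, as the question asks (n = 1:3 would reproduce the 1- to 3-gram range used above). Both quanteda and tm are assumed to be installed.

```r
library(quanteda)
library(tm)

# Shortened copy of the lyrics vector from the question (illustrative only)
nirvana <- c("hello hello hello how low",
             "with the lights out",
             "here we are now",
             "entertain us")

toks <- tokens(nirvana)                # split each line into word tokens
trig <- tokens_ngrams(toks, n = 3)     # keep only 3-grams, joined with "_"
dtm  <- convert(dfm(trig), to = "tm")  # tm DocumentTermMatrix
tdm  <- t(dtm)                         # transpose to a TermDocumentMatrix

inspect(tdm)
```

Lines with fewer than three words, such as "entertain us", contribute no trigrams, so their columns in the matrix are empty.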

I would use quanteda and convert to tm format:
nirvana %>% tokens(ngrams = 1:3) %>% dfm %>% convert(to = "tm")
@amatsuo_net Thanks, could you help me with an R example? @Cath thanks ;)
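For readers who want to see what a trigram term-document matrix contains without installing any package, here is a dependency-free sketch in base R. It is a swapped-in illustration, not the quanteda or ngramrr approach discussed in this thread: the helper name trigrams and the shortened lyrics vector are hypothetical, words are split on single spaces only, and the result is a plain integer matrix rather than a tm TermDocumentMatrix object.

```r
# Hypothetical helper: paste every run of three consecutive words together
trigrams <- function(line) {
  words <- strsplit(line, " ", fixed = TRUE)[[1]]
  if (length(words) < 3) return(character(0))  # too short for a trigram
  idx <- seq_len(length(words) - 2)
  paste(words[idx], words[idx + 1], words[idx + 2])
}

# Shortened copy of the lyrics from the question (illustrative only)
lyrics <- c("hello hello hello how low",
            "with the lights out",
            "entertain us")

per_doc <- lapply(lyrics, trigrams)        # trigrams of each document
terms   <- sort(unique(unlist(per_doc)))   # vocabulary of distinct trigrams

# Count how often each trigram occurs in each document
tdm <- vapply(per_doc,
              function(x) tabulate(match(x, terms), nbins = length(terms)),
              integer(length(terms)))

# Rows are trigrams, columns are documents
dimnames(tdm) <- list(Terms = terms, Docs = as.character(seq_along(lyrics)))

print(tdm)
```

This makes the layout explicit: one row per distinct trigram, one column per lyric line, and a count in each cell.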