Error converting text to lowercase with tm_map(..., tolower)
I tried to use tm_map. It gave the following error. How can I get around this?
require(tm)
byword<-tm_map(byword, tolower)
Error in UseMethod("tm_map", x) :
no applicable method for 'tm_map' applied to an object of class "character"
Use the base R function tolower():
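A minimal sketch of that suggestion, assuming byword is a plain character vector (as the class "character" in the error message indicates). The sample strings are made up for illustration:

```r
# Hypothetical sample data standing in for the asker's byword object
byword <- c("Diamond Shamrock Corp", "West Texas Intermediate")

# tolower() is vectorised over character vectors, so no tm_map() is needed
byword <- tolower(byword)
print(byword)
```

This only covers the case where the text is not yet a tm corpus.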
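Using tolower in this way on a corpus has an undesirable side effect: if you later try to create a term-document matrix from the corpus, it will fail. This is because of a recent change in tm, which cannot handle the return type of tolower. Instead, use: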
myCorpus <- tm_map(myCorpus, PlainTextDocument)
Expanding my comment into a more detailed answer here: you have to wrap tolower inside content_transformer so as not to mess up the VCorpus object, something like:
> library(tm)
> data('crude')
> crude[[1]]$content
[1] "Diamond Shamrock Corp said that\neffective today it had cut its contract prices for crude oil by\n1.50 dlrs a barrel.\n The reduction brings its posted price for West Texas\nIntermediate to 16.00 dlrs a barrel, the copany said.\n \"The price reduction today was made in the light of falling\noil product prices and a weak crude oil market,\" a company\nspokeswoman said.\n Diamond is the latest in a line of U.S. oil companies that\nhave cut its contract, or posted, prices over the last two days\nciting weak oil markets.\n Reuter"
> tm_map(crude, content_transformer(tolower))[[1]]$content
[1] "diamond shamrock corp said that\neffective today it had cut its contract prices for crude oil by\n1.50 dlrs a barrel.\n the reduction brings its posted price for west texas\nintermediate to 16.00 dlrs a barrel, the copany said.\n \"the price reduction today was made in the light of falling\noil product prices and a weak crude oil market,\" a company\nspokeswoman said.\n diamond is the latest in a line of u.s. oil companies that\nhave cut its contract, or posted, prices over the last two days\nciting weak oil markets.\n reuter"
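The same approach can be applied when starting from a character vector such as the asker's byword. The sketch below assumes the text first has to be turned into a corpus with VCorpus(VectorSource(...)); the sample strings and variable names are made up for illustration:

```r
library(tm)

# Hypothetical sample data standing in for the asker's character vector
byword <- c("Diamond Shamrock Corp", "West Texas Intermediate")

# tm_map() works on corpus objects, so build a VCorpus from the vector first
corpus <- VCorpus(VectorSource(byword))

# content_transformer() wraps tolower so each document stays a text document
lowered <- tm_map(corpus, content_transformer(tolower))

# Extract the transformed text from each document
lowered_text <- unname(sapply(lowered, content))
print(lowered_text)

# A term-document matrix can still be built from the transformed corpus
tdm <- TermDocumentMatrix(lowered)
print(dim(tdm))
```

Because the documents keep their class, later steps such as TermDocumentMatrix() should continue to work on the result.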
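Thanks! But why am I getting this error? I may need to use other tm_map applications!
The help file for tm_map (in package tm) shows a list of the available transformation functions, and tolower is not one of them. The transformations appear to be S3 methods that operate on objects of class "Corpus", so you cannot use just any function with tm_map.
You should wrap tolower inside content_transformer so as not to mess up the VCorpus object, like: tm_map(myCorpus, content_transformer(tolower))
@daroczig: please post that as an answer.
@smci Thanks for the idea, I have just submitted the above comment as a new answer :)
Which package does tm_map come from? It seems to depend on some non-base package. Please consider including the library statement for completeness.
@DanielKrizian: tm_map() comes from the tm package, and tolower() comes from base.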
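On why the original call failed: the error message shows that tm_map dispatches through UseMethod and that byword had class "character", for which no method was found. The base R sketch below reproduces that kind of failure with a made-up generic, my_map, and a made-up class, toy_corpus. Neither is part of tm; they only illustrate S3 dispatch:

```r
# Hypothetical generic that dispatches on the class of x via UseMethod()
my_map <- function(x, FUN, ...) UseMethod("my_map")

# A method exists only for the made-up class "toy_corpus"
my_map.toy_corpus <- function(x, FUN, ...) {
  x$docs <- lapply(x$docs, FUN, ...)
  x
}

toy <- structure(list(docs = list("Hello World", "Crude OIL")),
                 class = "toy_corpus")

# Dispatch succeeds for the class that has a method
result <- my_map(toy, tolower)
print(unlist(result$docs))

# A bare character vector has no method, so dispatch fails with an error
msg <- tryCatch(my_map("Hello World", tolower),
                error = function(e) conditionMessage(e))
print(msg)
```

By analogy, the fix for the original question is to pass tm_map a corpus object instead of a character vector.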