Error converting text to lowercase with tm_map(…, tolower)

I tried using tm_map. It gave the following error. How can I get around this?

 require(tm)
 byword<-tm_map(byword, tolower)

Error in UseMethod("tm_map", x) : 
  no applicable method for 'tm_map' applied to an object of class "character"
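Use the base R function tolower(). The error message says byword is a plain character vector, not a corpus, so tolower() can be applied to it directly.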
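A minimal sketch of that suggestion, assuming byword is an ordinary character vector. The sample strings are made up for illustration, since the asker's actual data is not shown:

```r
# Hypothetical sample data standing in for the asker's `byword` vector
byword <- c("The Quick BROWN Fox", "Jumps OVER the Lazy Dog")

# tolower() is vectorised over character vectors, so no corpus is needed
byword <- tolower(byword)

print(byword)
```

This only covers the case where the text never needs to become a tm corpus; once it does, the corpus-based approach discussed further down applies.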


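Passing tolower directly to tm_map on a corpus such as myCorpus has an undesirable side effect: if you later try to create a term-document matrix from the corpus, it will fail. This is because of a recent change in tm, which cannot handle the return type of tolower. Instead, use: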

myCorpus <- tm_map(myCorpus, PlainTextDocument)
Expanding my comment here into a more detailed answer: you have to wrap tolower in content_transformer so that it does not mess up the VCorpus object, something like:

> library(tm)
> data('crude')
> crude[[1]]$content
[1] "Diamond Shamrock Corp said that\neffective today it had cut its contract prices for crude oil by\n1.50 dlrs a barrel.\n    The reduction brings its posted price for West Texas\nIntermediate to 16.00 dlrs a barrel, the copany said.\n    \"The price reduction today was made in the light of falling\noil product prices and a weak crude oil market,\" a company\nspokeswoman said.\n    Diamond is the latest in a line of U.S. oil companies that\nhave cut its contract, or posted, prices over the last two days\nciting weak oil markets.\n Reuter"
> tm_map(crude, content_transformer(tolower))[[1]]$content
[1] "diamond shamrock corp said that\neffective today it had cut its contract prices for crude oil by\n1.50 dlrs a barrel.\n    the reduction brings its posted price for west texas\nintermediate to 16.00 dlrs a barrel, the copany said.\n    \"the price reduction today was made in the light of falling\noil product prices and a weak crude oil market,\" a company\nspokeswoman said.\n    diamond is the latest in a line of u.s. oil companies that\nhave cut its contract, or posted, prices over the last two days\nciting weak oil markets.\n reuter"

Thanks, but why did I get this error? I may need to use other tm_map transformations too!
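The help file for tm_map (in package tm) shows a list of available transformation functions, and tolower is not one of them. The transformations appear to be S3 methods that operate on objects of class "Corpus", so you cannot pass just any function to tm_map.

You should wrap tolower inside content_transformer so that it does not mess up the VCorpus object, like this:
tm_map(myCorpus, content_transformer(tolower))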
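To make the cause of the error concrete: tm_map dispatches on the class of its first argument, and the error message shows that byword was a plain character vector rather than a corpus. The sketch below is only an illustration under assumptions: the sample text, the variable names, and the toSpace helper are made up here and are not part of the original question. It first builds a VCorpus from a character vector, lowercases it through content_transformer, and then wraps a custom function the same way.

```r
library(tm)

# Hypothetical sample data; the asker's real `byword` contents are unknown
byword <- c("The Quick BROWN Fox", "Jumps/OVER the Lazy Dog")

# tm_map() needs a corpus, so wrap the character vector in one first
myCorpus <- VCorpus(VectorSource(byword))

# content_transformer() applies a plain string function to each document's
# content, so every element of the corpus stays a text document
myCorpus <- tm_map(myCorpus, content_transformer(tolower))

# Any other string function can be wrapped the same way; `toSpace` is a
# hypothetical helper that replaces a literal pattern with a space
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x, fixed = TRUE))
myCorpus <- tm_map(myCorpus, toSpace, "/")

# Transformations that tm ships ready to use with tm_map()
print(getTransformations())

# Building a term-document matrix still works after the transformations
tdm <- TermDocumentMatrix(myCorpus)
print(myCorpus[[2]]$content)
```

Functions listed by getTransformations() can be passed to tm_map as they are; anything else that works on character data is a candidate for wrapping in content_transformer.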
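@daroczig: please post this as an answer.
@smci: thanks for the idea, I just submitted the above comment as a new answer below :)
Which package does tm_map come from? It seems to depend on some non-base package. Please consider including the library statement for completeness.
@Daniel Krizian: tm_map() comes from the tm package, and tolower() comes from base.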