Stopwords can't remove words in R


I am trying to remove words from my text using a list of stopwords, but this is what happens:

library(corpus)
library(tm)
tokpedClean <- read.csv("D:/AS/tokpedClean5.csv")
head(tokpedClean)
tokpedCleanCor = Corpus(VectorSource(tokpedClean$text))

removeURL <- function(x) gsub("http[^[:space:]]*", "", x)
docsClean <- tm_map(tokpedCleanCor, content_transformer(removeURL))
inspect(docsClean[1:5])

The next step is the stopwords:

cStopwordID <- readLines("D:/AS/swID.csv")
stop <- tm_map(docsClean, removeWords, cStopwordID)

cStopwordID

I think you should first convert the corpus to a DocumentTermMatrix().
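
One possible reading of this suggestion is to let DocumentTermMatrix() drop the stopwords while it builds the matrix, through the stopwords entry of its control list. The sketch below assumes that reading; the sample sentences and the my_stopwords vector are made up for illustration and are not the asker's data.

```r
library(tm)

# Made-up sample text standing in for tokpedClean$text (assumption).
docs <- VCorpus(VectorSource(c("barang yang bagus", "pengiriman yang cepat")))

# Hypothetical stopword list; in the question it would come from swID.csv.
my_stopwords <- c("yang")

# Terms listed in control$stopwords are left out of the matrix.
dtm <- DocumentTermMatrix(docs, control = list(stopwords = my_stopwords))

print(Terms(dtm))
```

Note that this only removes the words from the term matrix; the text stored in the corpus itself is left unchanged.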

Your stopwords need to be a character vector, not a data.frame or a list.
cStopwordID <- readLines("D:/AS/swID.csv")
stop <- tm_map(docsClean, removeWords, cStopwordID)
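
As a minimal sketch of the point above, the following checks that the stopwords really are a plain character vector before passing them to removeWords. The temporary file, its contents and the sample sentences are assumptions standing in for swID.csv and tokpedClean$text; if the real file has a header row or quoted values, those would need to be stripped as well.

```r
library(tm)

# Hypothetical stand-in for the stopword file: one word per line.
sw_path <- tempfile(fileext = ".csv")
writeLines(c("yang", "dan", "di"), sw_path)

# readLines() returns a character vector; trim whitespace and drop empty entries.
stopword_vec <- trimws(readLines(sw_path))
stopword_vec <- stopword_vec[nzchar(stopword_vec)]
stopifnot(is.character(stopword_vec))

# Made-up sample text in place of the asker's data (assumption).
docs <- VCorpus(VectorSource(c("barang yang bagus dan murah",
                               "pengiriman di kota cepat")))

# Remove the stopwords, then collapse the gaps they leave behind.
docs_nostop <- tm_map(docs, removeWords, stopword_vec)
docs_nostop <- tm_map(docs_nostop, stripWhitespace)

cleaned <- trimws(unname(sapply(docs_nostop, as.character)))
print(cleaned)
```

If the stopwords were read with read.csv() instead, the result would be a data.frame, and a single column such as df[[1]] would have to be extracted (and converted with as.character() if needed) before calling removeWords.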