Exact matching of words in sentences against words in a dictionary in R
I have the following data frame of sentences, plus vectors of positive and negative words that serve as a dictionary:
sent <- data.frame(words = c("just right size and i love this notebook", "benefits great laptop",
"wouldnt bad notebook", "very good quality", "orgtop",
"great improvement for that bad product but overall is not good", "notebook is not good but i love batterytop"), user = c(1,2,3,4,5,6,7),
stringsAsFactors = FALSE)
posWords <- c("great","improvement","love","great improvement","very good","good","right","very","benefits",
"extra","benefit","top","extraordinarily","extraordinary","super","benefits super","good","benefits great")
negWords <- c("hate","bad","not good","horrible")
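Each sentence contains distinct words, and the problem is that I cannot get exact matches. For example, a dictionary entry such as "top" should not match inside "laptop", "orgtop", or "batterytop".
Could anyone help me, please? I really don't know how to approach this. I would appreciate any help or advice. Many thanks in advance.

Many polarity algorithms have already been implemented for R, including Matthew Jockers' new GitHub package. Have you looked at the existing implementations?
Yes, I looked on GitHub and browsed many web pages, but did not find anything similar :-(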
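For the exact-matching problem described in the question, one possible approach is to wrap each dictionary entry in regex word boundaries so that an entry only matches as a whole word or phrase. The following is a minimal sketch, not a definitive solution: `match_exact` is a hypothetical helper name, and it assumes the dictionary entries contain no regex metacharacters.

```r
# Hypothetical helper: return the dictionary entries that occur in `sentence`
# as whole words or whole phrases (assumes entries have no regex metacharacters).
match_exact <- function(sentence, dictionary) {
  hits <- vapply(dictionary, function(term) {
    # \\b anchors the entry at word boundaries, so "top" cannot match inside "laptop"
    grepl(paste0("\\b", term, "\\b"), sentence, perl = TRUE)
  }, logical(1))
  unique(dictionary[hits])
}

match_exact("benefits great laptop", c("great", "top", "benefits great"))
# [1] "great"          "benefits great"
match_exact("orgtop", c("top"))
# character(0)
```

Unlike splitting on spaces and comparing single tokens, this also finds multi-word entries such as "benefits great", at the cost of one regex search per dictionary entry.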
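library(plyr)  # provides ldply()
# Matched word plus up to two neighbours on each side, clamped to the sentence bounds
# (x[i - 2] with i = 1 would otherwise drop an element instead of selecting one).
context <- function(x, i) paste(x[max(1, i - 2):min(length(x), i + 2)], collapse = " ")
tokens <- strsplit(as.character(sent$words), " ")
dataOut <- ldply(seq_along(tokens), function(k) {
  x <- tokens[[k]]
  p <- which(x %in% posWords)  # exact, whole-token matches against the positive list
  n <- which(x %in% negWords)  # exact, whole-token matches against the negative list
  positive <- vapply(p, function(i) context(x, i), character(1))
  negative <- vapply(n, function(i) context(x, i), character(1))
  if (length(p) > 0 || length(n) > 0) {
    # One row per match: the user, the surrounding words, and +1 / -1 polarity
    data.frame(user = sent$user[k], word = c(positive, negative),
               val = rep(c(1, -1), c(length(p), length(n))), stringsAsFactors = FALSE)
  }
})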
strsplit(as.character(sent$words), " ")
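Splitting on spaces and comparing tokens with `%in%` gives exact matches for single words, but a multi-word dictionary entry such as "not good" can never equal a single token. The sketch below illustrates this on one sentence from the question and shows one possible workaround, pasting adjacent tokens into two-word phrases. The variable names are illustrative only.

```r
sentence <- "notebook is not good but i love batterytop"
tokens <- strsplit(sentence, " ", fixed = TRUE)[[1]]

# Exact token comparison: "batterytop" is not equal to "top", so it is not matched
single_hits <- tokens[tokens %in% c("good", "top", "love")]

# A two-word entry never equals a single token
phrase_hit <- "not good" %in% tokens

# Build adjacent two-word phrases and match those against the multi-word entries
bigrams <- paste(head(tokens, -1), tail(tokens, -1))
bigram_hits <- bigrams[bigrams %in% c("not good", "very good")]

single_hits   # "good" "love"
phrase_hit    # FALSE
bigram_hits   # "not good"
```

Note that "good" is still reported as a single-word hit even though it belongs to the negative phrase "not good". Resolving that overlap (for example, by matching longer phrases first and removing them) is left open here.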