Sentiment Analysis in R


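I am doing sentiment analysis with a list of words, each mapped to a score in the range 1-8, instead of counting positive words as 1 and negative words as -1.

Here is part of the list: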

word            score   
laughter        8.50    
happiness       8.44    
love            8.42    
happy           8.30    
laughed         8.26    
laugh           8.22

How can I apply this list in the score.sentiment function so that it sums score * word count instead of only counting words?

score.sentiment = function(sentences, new_list, .progress='none')
{
        require(plyr)
        require(stringr)

        # we got a vector of sentences. plyr will handle a list or a vector as an "l" for us
        # we want a simple array of scores back, so we use "l" + "a" + "ply" = laply:
        scores = laply(sentences, function(sentence, terms) {

                # clean up sentences with R's regex-driven global substitute, gsub():
                sentence = gsub('[[:punct:]]', '', sentence)
                sentence = gsub('[[:cntrl:]]', '', sentence)
                sentence = gsub('\\d+', '', sentence)
                # and convert to lower case:
                sentence = tolower(sentence)

                # split into words. str_split is in the stringr package
                word.list = str_split(sentence, '\\s+')
                # sometimes a list() is one level of hierarchy too much
                words = unlist(word.list)

                # compare our words to the dictionaries of positive & negative terms
                words.matches = match(words, terms)

                # match() returns the position of the matched term or NA
                # we just want a TRUE/FALSE:
                words.matches = !is.na(words.matches)

                # how to count the score??
                score = ?????

                return(score)
        }, new_list, .progress=.progress )

        scores.df = data.frame(score=scores, text=sentences)
        return(scores.df)
    }

Here is an example:

df <- read.table(header=TRUE, text="word            score   
laughter        8.50    
happiness       8.44    
love            8.42    
happy           8.30    
laughed         8.26    
laugh           8.22")
sentence <- "I love happiness"

words <- strsplit(sentence, "\\s+")[[1]]
score <- sum(df$score[match(words, df$word)], na.rm = TRUE)

print(score)
# [1] 16.86
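
The same lookup can be folded back into a function shaped like the one in the question. The following is only a sketch under a few assumptions: the name score_weighted and the lexicon argument are hypothetical (not from the original post), the lexicon is a data frame with word and score columns as in the example above, and base R (vapply, strsplit) stands in for plyr and stringr.

```r
# Hypothetical helper, not part of the original post.
# lexicon: data frame with columns `word` and `score` (assumed layout).
score_weighted <- function(sentences, lexicon) {
  scores <- vapply(sentences, function(sentence) {
    # same clean-up steps as in the question
    sentence <- gsub("[[:punct:]]", "", sentence)
    sentence <- gsub("[[:cntrl:]]", "", sentence)
    sentence <- gsub("\\d+", "", sentence)
    sentence <- tolower(sentence)

    # split on whitespace into individual words
    words <- unlist(strsplit(sentence, "\\s+"))

    # match() returns the lexicon row of each word, or NA if absent;
    # a word that occurs n times contributes its score n times
    sum(lexicon$score[match(words, as.character(lexicon$word))], na.rm = TRUE)
  }, numeric(1), USE.NAMES = FALSE)

  data.frame(score = scores, text = sentences, stringsAsFactors = FALSE)
}

lexicon <- data.frame(
  word  = c("laughter", "happiness", "love", "happy", "laughed", "laugh"),
  score = c(8.50, 8.44, 8.42, 8.30, 8.26, 8.22),
  stringsAsFactors = FALSE
)

result <- score_weighted(
  c("I love happiness", "Happy, happy laugh!", "nothing here"),
  lexicon
)
print(result)
```

Because match() is applied to every word in the sentence, repeated words are counted once per occurrence, which gives the "score * word count" behaviour asked for. Sentences with no lexicon words score 0.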
