Sentiment word cloud using R's quanteda?

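I have a set of reviews (a text comment plus a rating from 0 to 10), and I want to create a sentiment word cloud in R in which:

- the size of a word represents its frequency
- the color of a word represents the average rating of all the reviews it appears in (preferably on a green-yellow-red gradient)

I used quanteda to create a `dfm` of the comments. Now I want to use the `textplot_wordcloud` function, and I think I need to do the following:

1. For each word, get all the reviews it appears in
2. Compute the average rating of this subset of reviews
3. Divide by 10 to scale it to 0-1, and assign that value to the word
4. Sort the words by average rating (so that the colors are assigned correctly?)
5. Use `color=RColorBrewer::brewer.pal(11, "RdYlGn")` to compute the colors from the average ratings

I'm having trouble with steps 1 and 4. The rest should be doable. Can someone explain how a dfm can be easily read and manipulated?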

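I found an efficient way to do this using matrix multiplication. Essentially, the computation is `sw = sd * C / Nw`, where:

- `sw` = sentiment per word
- `sd` = rating per document
- `C` = document-by-word frequency matrix
- `Nw` = number of occurrences of each word

Here `*` is a matrix product and `/` is an element-wise division, so each word's sentiment is the mean rating of its documents, weighted by how often the word occurs in each.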
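To make the formula concrete, here is a minimal sketch on made-up data. The three reviews, their ratings, and the words are invented for illustration. A plain base-R matrix stands in for what `as.matrix(my_dfm)` would return. It also shows how to look up the documents containing a given word (step 1 of the question), and a variant that averages over documents rather than occurrences.

```r
# Toy data, invented for illustration: 3 reviews and 3 words.
ratings <- c(10, 4, 1)                       # sd: one rating per document
counts <- matrix(c(2, 0, 1,
                   1, 1, 0,
                   0, 3, 1),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2", "doc3"),
                                 c("great", "slow", "food")))  # C

# Step 1 for a single word: which documents contain "slow"?
docs_with_slow <- rownames(counts)[counts[, "slow"] > 0]

# sw = (sd %*% C) / Nw: mean rating per word, weighted by occurrences
n_occ <- colSums(counts)                                    # Nw
word_score <- as.numeric(ratings %*% counts) / n_occ

# Variant: every document containing the word counts once,
# however often the word occurs in it
present <- (counts > 0) * 1
word_score_docs <- as.numeric(ratings %*% present) / colSums(present)

print(word_score)
print(word_score_docs)
```

The two versions differ only when a word occurs more than once in a review. The question's wording ("average rating of all the reviews it appears in") matches the second variant, while the formula above corresponds to the first.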

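Code:

# load the packages used below
library(quanteda)  # dfm objects; in quanteda >= 3, textplot_wordcloud() lives in quanteda.textplots
library(scales)    # seq_gradient_pal()

# create the necessary variables
# (df holds the reviews and my_dfm is the dfm built from them, one row per review)
sd <- as.integer(df$rating)   # rating per document
C <- as.matrix(my_dfm)        # document-by-word count matrix
Nw <- as.integer(colSums(C))  # total number of occurrences of each word

# calculate the word sentiment (occurrence-weighted mean rating per word)
sw <- as.numeric(sd %*% C) / Nw

# normalize the word sentiment to values between 0 and 1
sw <- (sw - min(sw)) / (max(sw) - min(sw))

# make a function that converts a sentiment value to a color
num_to_color <- seq_gradient_pal(low="#FF0000", high="#00FF00")

# apply the function to the sentiment values
word_colors <- num_to_color(sw)

# create a new window;
# before executing the next command, maximize it manually to get a more readable wordcloud
dev.new()

# create the wordcloud with the calculated color values
textplot_wordcloud(my_dfm, color=word_colors)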
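The question asked for a green-yellow-red gradient, that is, one with a yellow midpoint. As a possible alternative to the two-color palette above, here is a minimal sketch using base R's `colorRamp()` with three color stops. The scores are made up, and this swaps `grDevices` in for `scales::seq_gradient_pal()`; it is not the method used in the answer's code.

```r
# Hypothetical normalized word scores in [0, 1], invented for illustration.
score <- c(great = 1.0, food = 0.5, slow = 0.0)

# colorRamp() returns a function mapping values in [0, 1] to RGB rows (0-255),
# interpolating linearly between the given color stops.
ramp <- colorRamp(c("red", "yellow", "green"))
rgb_vals <- ramp(score)

# Convert the RGB rows to hex color strings, one per word.
word_colors <- rgb(rgb_vals[, 1], rgb_vals[, 2], rgb_vals[, 3],
                   maxColorValue = 255)
names(word_colors) <- names(score)

print(word_colors)
```

Because `colorRamp()` accepts any vector of colors, the `RdYlGn` palette from `RColorBrewer::brewer.pal(11, "RdYlGn")` mentioned in the question could be passed in place of the three named colors.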
