Text analysis on a subgraph in R (igraph)
I am curious how to access additional attributes of a graph that are associated with its edges. Here is a simple example:
library("igraph")
library("SocialMediaLab")
myapikey =''
myapisecret =''
myaccesstoken = ''
myaccesstokensecret = ''
tweets <- Authenticate("twitter",
                       apiKey = myapikey,
                       apiSecret = myapisecret,
                       accessToken = myaccesstoken,
                       accessTokenSecret = myaccesstokensecret) %>%
  Collect(searchTerm = "#trump", numTweets = 100, writeToFile = FALSE, verbose = TRUE)
g_twitter_actor <- tweets %>% Create("Actor", writeToFile=FALSE)
c <- igraph::components(g_twitter_actor, mode = 'weak')
subCluster <- induced.subgraph(g_twitter_actor, V(g_twitter_actor)[which(c$membership == which.max(c$csize))])
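How can I access the text attribute of the subgraph in order to run a text analysis?
E(subCluster)$text
does not work.
E(subCluster)$text does not work because the values of tweets$text are not added to the graph when it is created, so you have to add them manually. It is a bit of a pain, but doable: it requires some subsetting of the tweets data frame and matching on user names.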
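As a minimal sketch of the point above, the following toy example shows that an edge attribute exists only once it has been set on the graph, and that an induced subgraph then keeps it for the edges it retains. The graph, the vertex names, and the texts are made up for illustration; only igraph functions are used.

```r
library(igraph)

# Toy directed graph standing in for g_twitter_actor (hypothetical data).
# graph_from_data_frame() keeps the row order as the edge order.
edges <- data.frame(from = c("a", "b", "d"),
                    to   = c("b", "c", "e"),
                    stringsAsFactors = FALSE)
g <- graph_from_data_frame(edges, directed = TRUE)

# No "text" edge attribute has been set yet.
had_text_before <- "text" %in% edge_attr_names(g)

# Attach one text per edge, in edge order.
E(g)$text <- c("first tweet", "second tweet", "third tweet")

# Keep the largest weakly connected component (a, b, c).
comp <- components(g, mode = "weak")
sub  <- induced_subgraph(g, V(g)[comp$membership == which.max(comp$csize)])

# The subgraph carries the attribute for the edges it retains.
sub_text <- E(sub)$text
```

So once the text is attached to g_twitter_actor, rebuilding subCluster from it should make E(subCluster)$text available.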
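First, note that the edge types appear in a specific order: retweets, then mentions, then replies. The same text from a given user can apply to all three types, so I think it makes sense to add the text one edge type at a time.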
> unique(E(g_twitter_actor)$edgeType)
[1] "Retweet" "Mention" "Reply"
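One way to check that the edge types really form contiguous blocks, rather than just listing the unique values, is run-length encoding. This is a small base R sketch on a made-up vector; with the real graph the input would be E(g_twitter_actor)$edgeType.

```r
# Hypothetical edge types, in the order the edges appear in the graph.
edge_type <- c("Retweet", "Retweet", "Mention", "Mention", "Mention", "Reply")

# rle() collapses consecutive repeats. If every type shows up exactly once
# in the result, each type occupies one contiguous block of edges.
runs <- rle(edge_type)
blocks <- data.frame(edgeType = runs$values, n_edges = runs$lengths)
is_contiguous <- anyDuplicated(runs$values) == 0
```

The n_edges column also gives the number of edges per type, which is the number of texts each block expects.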
Using dplyr and reshape2 simplifies this.
library(reshape2); library(dplyr)
#Make data frame for retweets, mentions, replies
rts <- tweets %>% filter(!is.na(retweet_from))
ms <- tweets %>% filter(users_mentioned!="character(0)")
rpls <- tweets %>% filter(!is.na(reply_to))
Now add each of these to the network as an edge attribute, matching on edge type.
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Retweet"] <- rts$text
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Mention"] <- ms$text
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Reply"] <- rpls$text
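These assignments only line up if each data frame has exactly as many rows as there are edges of that type. The sketch below wraps the assignment in a guard that fails loudly on a mismatch. The helper name set_edge_text and the toy data are hypothetical, and the idea that a mismatch may come from one tweet mentioning several users is an assumption, not something the collected data confirms.

```r
library(igraph)

# Hypothetical helper: assign texts to the edges of one type, but only when
# the number of texts equals the number of matching edges.
set_edge_text <- function(g, type, texts) {
  idx <- which(E(g)$edgeType == type)
  if (length(idx) != length(texts)) {
    stop(sprintf("%s: %d edges but %d texts", type, length(idx), length(texts)))
  }
  # Start from the existing attribute, or from all-NA if it is not set yet.
  current <- if ("text" %in% edge_attr_names(g)) {
    E(g)$text
  } else {
    rep(NA_character_, ecount(g))
  }
  current[idx] <- texts
  E(g)$text <- current
  g
}

# Toy graph with an edgeType attribute (made-up data).
edges <- data.frame(from = c("a", "b", "a", "a"),
                    to   = c("b", "c", "c", "d"),
                    edgeType = c("Retweet", "Mention", "Mention", "Reply"),
                    stringsAsFactors = FALSE)
g <- graph_from_data_frame(edges, directed = TRUE)

g <- set_edge_text(g, "Retweet", "RT text")
g <- set_edge_text(g, "Mention", c("hi @c", "hi @c again"))

# A mismatch is reported instead of being silently recycled.
mismatch <- tryCatch(set_edge_text(g, "Reply", c("x", "y")),
                     error = function(e) conditionMessage(e))
```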
Regarding E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Retweet"]: thank you very much. Do you know how I can solve this problem?
#Name each element in the users_mentioned list after the user who mentioned
names(ms$users_mentioned) <- ms$screen_name
ms <- melt(ms$users_mentioned) #melting creates a data frame for each user and the users they mention
#Add the text
ms$text <- tweets[match(ms$L1,tweets$screen_name),1]
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Retweet"] <- rts$text
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Mention"] <- ms$text
E(g_twitter_actor)$text[E(g_twitter_actor)$edgeType %in% "Reply"] <- rpls$text
subCluster <- induced.subgraph(g_twitter_actor,
                               V(g_twitter_actor)[which(c$membership == which.max(c$csize))])
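With the text attached, E(subCluster)$text is an ordinary character vector, so any text-analysis tooling can take it from there. As a rough illustration, here is a word-frequency count in base R. The edge_text vector is made up and stands in for E(subCluster)$text, and the tokenisation rule is a simplifying assumption rather than a recommended one.

```r
# Hypothetical edge texts standing in for E(subCluster)$text.
edge_text <- c("Great rally today #trump", "RT great rally", NA, "Rally news")

# Drop edges without text, lower-case the rest, and split on anything that
# is not a lower-case letter, a digit, or '#'.
texts  <- tolower(edge_text[!is.na(edge_text)])
tokens <- unlist(strsplit(texts, "[^a-z0-9#]+"))
tokens <- tokens[nzchar(tokens)]

# Count tokens, most frequent first.
word_freq <- sort(table(tokens), decreasing = TRUE)
```

Because the same tweet text can sit on several edges, it may be worth running unique() on the texts first if each tweet should count only once.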