Regex in R: searching for words in a sentence
I would like to ask for your advice on the following. I have a data frame:
reviews <- data.frame(value = c("Product was received in excellent condition. Made with high quality materials. Very Good product",
"Inexpensive. An improvement over integrated graphics.",
"I love that product so excite. I will order again if I need more .",
"Excellent card, great graphics."),
user = c(1,2,3,4),
Review_Id = c("101968","101968","210546","112546"))
Any suggestions or approaches would be greatly appreciated. Many thanks in advance.

You can try:
merge.data.frame(x = topics, y = reviews, by = c("Review_Id"), all.x = TRUE, all.y = FALSE)
Have you tried merge? That is:
merge(topics, reviews)
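In topics, is it normal for the same Review_Id to be linked to two different users? Otherwise, you can try
merge.data.frame(x = topics, y = reviews, by = c("Review_Id"), all.x = TRUE, all.y = FALSE)
or, once the double-user issue is fixed,
merge.data.frame(x = topics, y = reviews, by = c("Review_Id", "user"), all.x = TRUE, all.y = FALSE)

I did that. When I use merge, I get all the sentences of a review in one row, but I only need the sentence that contains the specific topic. And yes, the same Review_Id can be linked to two different users. The problem is that I need only the one sentence containing the specific topic. Any ideas? The desired output shows what I need.

Assuming you add stringsAsFactors = FALSE to the reviews data frame, the following code returns a logical vector indicating which sentences of the first review contain the first topic:
grepl(topics$topic[1], strsplit(reviews$value[1], '.', fixed = TRUE)[[1]])
The rest should be straightforward.

Thank you, this is what I needed before moving on to a smarter regex to extract the specific sentences that contain the topics.

Comment: Please pay attention to the quality of your post.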
Desired output:

topic                user  Review_Id  review
product              1     101968     Product was received in excellent condition.
condition            1     101968     Product was received in excellent condition.
materials            1     101968     Made with high quality materials.
product              1     101968     Very Good product
integrated graphics  2     101968     An improvement over integrated graphics.
product              3     210546     I love that product so excite.
card                 4     112546     Excellent card, great graphics.
graphics             4     112546     Excellent card, great graphics.
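
The strsplit/grepl idea from the answer can be extended to every topic row. The following is only a minimal sketch, not the answerer's own code. The topics data frame does not appear in the thread as preserved here, so the one below is a hypothetical reconstruction inferred from the desired output. The helper name matching_sentences is also made up for illustration. Matching is done case-insensitively by lower-casing both sides, since a case-sensitive grepl would not find "product" in "Product was received ...".

```r
# Review data from the question; stringsAsFactors = FALSE keeps text as character.
reviews <- data.frame(
  value = c("Product was received in excellent condition. Made with high quality materials. Very Good product",
            "Inexpensive. An improvement over integrated graphics.",
            "I love that product so excite. I will order again if I need more .",
            "Excellent card, great graphics."),
  user = c(1, 2, 3, 4),
  Review_Id = c("101968", "101968", "210546", "112546"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: hypothetical topics table, inferred from the desired output.
topics <- data.frame(
  topic = c("product", "condition", "materials", "integrated graphics",
            "product", "card", "graphics"),
  user = c(1, 1, 1, 2, 3, 4, 4),
  Review_Id = c("101968", "101968", "101968", "101968",
                "210546", "112546", "112546"),
  stringsAsFactors = FALSE
)

# Attach the full review text to each topic row. Joining on both keys avoids
# pairing a topic with another user's review that shares the same Review_Id.
merged <- merge(topics, reviews, by = c("Review_Id", "user"), all.x = TRUE)

# Hypothetical helper: split a review on literal periods and keep the
# sentences that contain the topic (case-insensitive, literal match).
matching_sentences <- function(topic, text) {
  sentences <- trimws(strsplit(text, ".", fixed = TRUE)[[1]])
  sentences <- sentences[nzchar(sentences)]
  sentences[grepl(tolower(topic), tolower(sentences), fixed = TRUE)]
}

# Build one output row per (topic, matching sentence) pair.
rows <- lapply(seq_len(nrow(merged)), function(i) {
  s <- matching_sentences(merged$topic[i], merged$value[i])
  if (length(s) == 0) return(NULL)
  data.frame(topic = merged$topic[i], user = merged$user[i],
             Review_Id = merged$Review_Id[i], review = s,
             stringsAsFactors = FALSE)
})
result <- do.call(rbind, Filter(Negate(is.null), rows))
print(result)
```

Note that strsplit drops the periods it splits on, so the sentences come back without their trailing full stop. The literal substring match would also find "card" inside a longer word such as "cardboard". If that matters, a regex with word boundaries (for example grepl with a pattern like "\\bcard\\b" and ignore.case = TRUE) is one possible refinement.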