Scraping reviews from a specific HTML page with rvest (R, rvest)

I am scraping a page to get the review and the user comments. I am using SelectorGadget to get the CSS selectors. This is what I have so far:

teambhp <- read_html("http://www.team-bhp.com/forum/official-new-car-reviews/171841-tata-safari-storme-varicor-400-official-review.html")
titles <- teambhp %>% html_node("hr+ div , i ,strong u , #posts ") %>% html_text()

This returns only one result, but I want all 23 matches saved in a list. How can I do that?

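See help("html_node"):

html_node vs html_nodes

html_node is like [[: it always extracts exactly one element. When given a list of nodes, html_node will always return a list of the same length, whereas the length of html_nodes might be longer or shorter.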
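The same-length behaviour quoted above can be seen on a small inline document. This is only a sketch: the HTML string and the names `page`, `posts`, `first_i` and `all_i` are made up for illustration and have nothing to do with the forum page in the question.

```r
library(rvest)

# Hypothetical stand-in for a page with three posts:
# two <i> tags in the first, two in the second, none in the third.
page <- read_html(paste0(
  '<div class="post"><i>a</i><i>b</i></div>',
  '<div class="post"><i>c</i><i>d</i></div>',
  '<div class="post"></div>'
))

posts <- page %>% html_nodes("div.post")

# html_node(): one result per input node, so the length matches `posts`;
# a post with no match yields a missing node.
first_i <- posts %>% html_node("i")

# html_nodes(): every match across all inputs, so the length can differ.
all_i <- posts %>% html_nodes("i")

length(posts)
length(first_i)
length(all_i)
```

Here `first_i` has one entry per post, while `all_i` holds every `<i>` found, which is why the two lengths differ.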

You need to replace it with html_nodes() (note the s):

titles <- teambhp %>% html_nodes("hr+ div , i ,strong u , #posts ") %>% html_text()

The original html_node() call produces this warning, because only the first of the 23 matches is used:

Warning message:
In node_find_one(x$node, x$doc, xpath = xpath, nsMap = ns) :
23 matches for .//hr/following-sibling::*[name() = 'div' and (position() = 1)] | .//i | .//strong/descendant-or-self::*/u | .//*[@id = 'posts']: 
using first

With html_nodes(), all matches are returned:

titles <- teambhp %>% html_nodes("hr+ div , i ,strong u , #posts ") %>% html_text()
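Since the question asks for a list, note that html_text() returns a character vector, which as.list() can convert if a list is really needed. A minimal sketch on a made-up inline document (the HTML string and the names `page`, `single`, `every` and `titles_list` are assumptions for illustration, not taken from the forum page):

```r
library(rvest)

# Hypothetical stand-in for the thread: three matching <i> elements.
page <- read_html('<div id="posts"><i>one</i><i>two</i><i>three</i></div>')

# html_node() keeps only the first match.
single <- page %>% html_node("i") %>% html_text()

# html_nodes() keeps every match; html_text() gives a character vector.
every <- page %>% html_nodes("i") %>% html_text()

# Convert the character vector to a list, one element per match.
titles_list <- as.list(every)

length(single)
length(titles_list)
```

In rvest 1.0.0 and later, html_element() and html_elements() are the preferred names for these two functions; the older names used here still work.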