CSS: when web scraping, how do I increase the number n in "li:nth-child(n)" by 1?


css r web-scraping web-crawler


I want to scrape my website with R. In the "li:nth-child(n)" part of the selector, I want n to increase by 1 each time:

 #cMain > div.section_bestseller > div.wrap_bestseller_rest > ul > li:nth-child(1) > dl > dt > a > strong
 #cMain > div.section_bestseller > div.wrap_bestseller_rest > ul > li:nth-child(2) > dl > dt > a > strong
 # and likewise for li:nth-child(3), li:nth-child(4), up to li:nth-child(10)

So I want to extract items 1 through 10 in total. How should I do this?

library(rvest)
library(httr)

all.titles <- c()

for (page in 1:10){
  url='http://book.daum.net/bestseller/list.do?categoryID=SP1KOR00000&ymd=2017082&cpId=KY&pageNo='
  url_page <- paste0(url,page)
  reading_html <- read_html(url_page)

  text_nodes <- reading_html %>% html_node('div.section_bestseller') %>% html_nodes('div.wrap_bestseller_rest') %>% html_node('ul') %>% html_node('li:nth-child(1)') %>% html_node('dl')%>% html_node('dt')%>% html_node('a')   
  title <- html_text(text_nodes)
  all.titles<-c(all.titles, title)

  print(page)
}



result<-data.frame(all.titles)

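@PoGibas I pressed F12 in Chrome and used "Copy selector" on the area I want to extract. That is the part marked with #. As I said in the title, I want n in li:nth-child(n) to run from 1 to 10. I only ran it for one number.

@bri I am still a beginner and did not fully understand your comment. If you have 10 items, such as li:nth-child(1), li:nth-child(2) ~ li:nth-child(9), li:nth-child(10)

@and bri Crawling is the process of getting what I want from a website (scraping = crawling). The address I am crawling is listed in the "url" part of the code above.

What is crawling?? Do you want to download the content of this website?

@and bri On this page, I want to download this part: #cMain > div.section_bestseller > div.wrap_bestseller_rest > ul > li:nth-child(1) > dl > dt > a > strong

Thank you very much for your reply! But in this case I can only get one page of information (there is a page number at the end of the URL, and I need up to 10 pages).

You need an XML parser to get the information from the site: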
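library(XML)  # provides htmlParse(), xpathSApply() and xmlValue()

url <- 'http://book.daum.net/bestseller/list.do?categoryID=SP1KOR00000&ymd=2017082&cpId=KY&pageNo='
out <- NULL
for (z in 1:10) {
  # Parse page z of the bestseller list
  hh <- htmlParse(paste0(url, z))
  # Take the title text of every list item under div.wrap_bestseller_rest
  a <- xpathSApply(hh, "//div[@class='wrap_bestseller_rest']/*/*/*/dt/a/strong", xmlValue)
  # Append this page's titles as a new column
  out <- cbind(out, a)
}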
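
For readers who would rather stay with rvest, as in the question, the same idea can be sketched in two ways: drop `:nth-child()` so the selector matches every `li` at once, or build one selector per n with `sprintf()`. The inline HTML below is a hypothetical stand-in for the real page (its markup is assumed from the selector in the question, and the titles are made up), so this is an illustration rather than a tested scraper for the site.

```r
library(rvest)

# Hypothetical stand-in for one page of the bestseller list (not the real markup).
page <- read_html('
<div id="cMain"><div class="section_bestseller"><div class="wrap_bestseller_rest">
<ul>
  <li><dl><dt><a href="#"><strong>Title A</strong></a></dt></dl></li>
  <li><dl><dt><a href="#"><strong>Title B</strong></a></dt></dl></li>
  <li><dl><dt><a href="#"><strong>Title C</strong></a></dt></dl></li>
</ul>
</div></div></div>')

# Option 1: leave out :nth-child() so the selector matches every <li> at once.
all_at_once <- page %>%
  html_nodes("div.wrap_bestseller_rest > ul > li > dl > dt > a > strong") %>%
  html_text()

# Option 2: build one selector per n and query them one after another.
n <- 1:3
selectors <- sprintf(
  "div.wrap_bestseller_rest > ul > li:nth-child(%d) > dl > dt > a > strong", n
)
one_by_one <- vapply(
  selectors,
  function(s) html_text(html_node(page, s)),
  character(1),
  USE.NAMES = FALSE
)

print(all_at_once)
print(one_by_one)
```

Option 1 is usually the simpler choice, because it does not require knowing in advance how many items a page has. Placing it inside the `for (page in 1:10)` loop from the question, with `read_html(url_page)` in place of the inline HTML, should collect the titles from every page.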