Web scraping: getting a table with rvest


I am trying to scrape a table.

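The token needs to be sent as a request header named x-xsrf-token, not passed as a query parameter. The token value can also change from one session to the next, so you need to read it from the session's cookies. After that, convert the returned data to a data frame to get the result: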

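library(rvest)

# Open a session on the page so the server sets its cookies.
# (html_session() is the older name; rvest 1.0.0 renamed it session().)
pg <- html_session("https://www.barchart.com/options/stocks-by-sector?page=1")

# Look up the XSRF-TOKEN cookie by name and URL-decode its value.
cookies <- pg$response$cookies
token <- URLdecode(dplyr::recode("XSRF-TOKEN", !!!setNames(cookies$value, cookies$name)))

# Call the JSON endpoint within the same session, sending the token as a header.
# request_GET is an unexported rvest function (note the :::), so it may change
# between rvest versions.
pg <- 
  pg %>% rvest:::request_GET(
    "https://www.barchart.com/proxies/core-api/v1/quotes/get?lists=stocks.optionable.by_sector.all.us&fields=symbol%2CsymbolName%2ClastPrice%2CpriceChange%2CpercentChange%2ChighPrice%2ClowPrice%2Cvolume%2CtradeTime%2CsymbolCode%2CsymbolType%2ChasOptions&orderBy=symbol&orderDir=asc&meta=field.shortName%2Cfield.type%2Cfield.description&hasOptions=true&page=1&limit=1000000&raw=1",
    config = httr::add_headers(`x-xsrf-token` = token)
  )

# Parse the JSON body, then bind each record's raw values into one data frame.
data_raw <- httr::content(pg$response)
data <- 
  purrr::map_dfr(
    data_raw$data,
    function(x){
      as.data.frame(x$raw)
    }
  )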
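
The two data-handling steps above (reading the token out of the cookie table, then flattening the nested response) can be tried offline with a small base R sketch. Everything here is mock data: the cookie values, the laravel_session cookie name, and the get_cookie helper are made up for illustration, and the response is assumed to have the same data / raw nesting that the code above relies on.

```r
# Mock of the cookie table exposed as response$cookies (name/value columns).
# The values and the laravel_session name are invented for this example.
cookies <- data.frame(
  name  = c("laravel_session", "XSRF-TOKEN"),
  value = c("abc123", "eyJpdiI6%3D%3D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find one cookie by name and URL-decode its value.
get_cookie <- function(cookies, cookie_name) {
  value <- cookies$value[cookies$name == cookie_name]
  if (length(value) != 1) {
    stop("expected exactly one cookie named ", cookie_name)
  }
  utils::URLdecode(value)
}

token <- get_cookie(cookies, "XSRF-TOKEN")

# Mock of the parsed JSON body: a list of records, each with a `raw` sub-list.
data_raw <- list(data = list(
  list(raw = list(symbol = "A", lastPrice = 1.5)),
  list(raw = list(symbol = "B", lastPrice = 2.25))
))

# Turn each record's `raw` list into a one-row data frame, then stack the rows.
rows <- lapply(
  data_raw$data,
  function(x) as.data.frame(x$raw, stringsAsFactors = FALSE)
)
data <- do.call(rbind, rows)
```

Here the lookup is a plain subset on the name column rather than the dplyr::recode trick, which makes a missing or duplicated cookie fail loudly. With real responses a record may contain NULL fields, in which case as.data.frame would need extra handling; the mock data avoids that case.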