Web scraping: getting a table with rvest
I am trying to scrape a table.

The token needs to be sent as a request header named x-xsrf-token, rather than being passed as a parameter.
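In addition, the token value may change from session to session, so you need to read it from the cookies. Then convert the data to a data frame to get the result: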
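library(rvest)

# Open a session on the page so the server sets its cookies.
# Note: html_session() is called session() in rvest >= 1.0.0.
pg <- html_session("https://www.barchart.com/options/stocks-by-sector?page=1")

# Look up the XSRF-TOKEN cookie by name and URL-decode its value.
cookies <- pg$response$cookies
token <- URLdecode(dplyr::recode("XSRF-TOKEN", !!!setNames(cookies$value, cookies$name)))

# Request the JSON endpoint within the same session, sending the token as a header.
# Note: rvest:::request_GET is an unexported internal function from older
# rvest releases and may not exist in newer versions.
pg <-
  pg %>% rvest:::request_GET(
    "https://www.barchart.com/proxies/core-api/v1/quotes/get?lists=stocks.optionable.by_sector.all.us&fields=symbol%2CsymbolName%2ClastPrice%2CpriceChange%2CpercentChange%2ChighPrice%2ClowPrice%2Cvolume%2CtradeTime%2CsymbolCode%2CsymbolType%2ChasOptions&orderBy=symbol&orderDir=asc&meta=field.shortName%2Cfield.type%2Cfield.description&hasOptions=true&page=1&limit=1000000&raw=1",
    config = httr::add_headers(`x-xsrf-token` = token)
  )

# Parse the JSON body into a nested list.
data_raw <- httr::content(pg$response)

# Turn the `raw` element of each record into one row and bind the rows.
data <-
  purrr::map_dfr(
    data_raw$data,
    function(x) {
      as.data.frame(x$raw)
    }
  )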
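
The cookie lookup and the list-to-data-frame step can be tried offline, without touching the site. The following is a minimal sketch in base R, assuming a made-up cookie table and a made-up response list. The cookie values, the record contents, and the helper name get_cookie are all hypothetical and are not taken from barchart.com:

```r
# Hypothetical cookie table with `name` and `value` columns,
# standing in for pg$response$cookies.
cookies <- data.frame(
  name  = c("session_id", "XSRF-TOKEN"),
  value = c("abc123", "eyJpdiI6IkFCQyJ9%3D%3D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: find exactly one cookie by name and URL-decode its value.
get_cookie <- function(cookies, name) {
  value <- cookies$value[cookies$name == name]
  if (length(value) != 1) stop("cookie not found or ambiguous: ", name)
  utils::URLdecode(value)
}

token <- get_cookie(cookies, "XSRF-TOKEN")

# Made-up stand-in for the parsed JSON: a list of records,
# each carrying its fields in a `raw` element.
data_raw <- list(data = list(
  list(raw = list(symbol = "A",  lastPrice = 10.5)),
  list(raw = list(symbol = "AA", lastPrice = 20.25))
))

# Build one single-row data frame per record, then bind the rows.
data <- do.call(rbind, lapply(data_raw$data, function(x) {
  as.data.frame(x$raw, stringsAsFactors = FALSE)
}))
```

Failing loudly when the cookie is missing makes it easier to notice that the site has changed its cookie name, instead of silently sending an empty header.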