Web scraping: XPath code returns "subscript out of bounds" message in R
I am trying to scrape a table inside a for loop, but the code returns the following error: "Error in Raw_Dividends[[1]] : subscript out of bounds". I suspect the XPath may be incorrect, but I have tried many similar XPath expressions taken from the HTML source without success. Any suggestions would be appreciated. The code is below:
library(jsonlite)
library(rvest)
url = "https://www.dividenddata.co.uk/ftse-dividend-history.py?market=ftse100"
xpath = '/html/body/section/div[3]/div[1]/div/table'
url_html <- read_html(url)
Stock_Table <- url_html %>% html_nodes(xpath = xpath) %>% html_table(fill = T)
Stock_Table <- Stock_Table[[1]]
Stock_Table <- Stock_Table[,c(1,2)]
Dividends <- data.frame()
for (i in 1:nrow(Stock_Table)) {
  url = paste0("https://www.dividenddata.co.uk/dividendhistory.py?epic=",Stock_Table[1,1])
  xpath = '/html/body/section/div[3]/div/div/div[3]/div[1]/div'
  Raw_Dividends <- url_html %>% html_nodes(xpath = xpath) %>%
    html_table(fill = T)
  Raw_Dividends <- Raw_Dividends[[1]]
  Raw_Dividends$Symbol <- rep(Stock_Table[1,i],nrow(Stock_Table))
  Dividends <- rbind(Dividends, Raw_Dividends)
}
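You have a couple of problems at the line url = paste0("...", Stock_Table[1,1]).
First, the loop only ever references the first stock; inside the loop you need to replace that index so the line ends with Stock_Table[i,1]).
Second, you are missing a read_html call on the newly created URL, so html_nodes is still being run against the original FTSE 100 page (url_html), where the second XPath matches nothing.
Comment: Thanks Dave2e, that solved the problem.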
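For anyone landing here later, the two fixes from the answer can be put together roughly as follows. This is only a sketch: the function names (build_dividend_url, scrape_dividends, scrape_all_dividends) are hypothetical and not from the original post, and the second XPath is copied from the question without checking that it still matches a table on the live site. The length check is an extra guard so that an XPath that matches nothing gives NULL instead of the "subscript out of bounds" error.

```r
# Sketch only: function names are hypothetical, not from the original post.
# rvest calls are namespace-qualified, so rvest must be installed to scrape.

# Build the per-stock URL from an EPIC code (URL pattern taken from the question).
build_dividend_url <- function(epic) {
  paste0("https://www.dividenddata.co.uk/dividendhistory.py?epic=", epic)
}

# Download ONE stock's page and return its first matching table,
# or NULL when the XPath matches nothing or the table is empty.
scrape_dividends <- function(epic, xpath) {
  # Fix 2 from the answer: read the newly built URL, not the index page.
  page <- rvest::read_html(build_dividend_url(epic))
  tables <- rvest::html_table(rvest::html_nodes(page, xpath = xpath), fill = TRUE)
  if (length(tables) == 0) return(NULL)
  result <- as.data.frame(tables[[1]])
  if (nrow(result) == 0) return(NULL)
  result$Symbol <- epic  # a single value is recycled to every row
  result
}

# Fix 1 from the answer: visit every EPIC code, not just the first one.
# Assumes every per-stock table has the same columns, otherwise rbind fails.
scrape_all_dividends <- function(epics, xpath) {
  pieces <- lapply(epics, scrape_dividends, xpath = xpath)
  do.call(rbind, pieces)  # NULL entries are ignored by rbind
}

# Usage (needs network access; Stock_Table comes from the question's code):
# xpath <- '/html/body/section/div[3]/div/div/div[3]/div[1]/div'
# Dividends <- scrape_all_dividends(Stock_Table[[1]], xpath)
```

Passing Stock_Table[[1]] hands over the first column as a plain vector, which behaves the same whether html_table returned a data.frame or a tibble.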