R htmltab: workaround for the "table not found" error?
I'm trying to scrape some (a lot of) NCAA men's basketball data from a website called RealGM. My code is below:
library(htmltab)
tables <- list()
for (i in 0:1548) {
for (j in 0:16) {
for (k in 0:4) {
a <- i+1
b <- 2003+j
c <- k+1
url <- paste("https://basketball.realgm.com/ncaa/conferences/Big-Ten-Conference/2/Michigan/",a,"/individual-games/",b,"/minutes/Season/desc/",c,sep = "")
tables[[paste(i,j,k,sep = "")]] <- htmltab(url,rm_nodata_cols = F,which = 1)
}
}
}
One option is to use tryCatch and skip the URLs that give an error:
library(htmltab)
tables <- list()
for (i in 1:1549) {
for (j in 2003:2019) {
for (k in 1:5) {
url <- paste0("https://basketball.realgm.com/ncaa/conferences/Big-Ten-Conference/2/Michigan/",i,"/individual-games/",j,"/minutes/Season/desc/",k)
tables[[paste0(i,j,k)]] <- tryCatch({
htmltab(url,rm_nodata_cols = F,which = 1)
}, error = function(e) {
cat("Wrong URL : ", url, " skipping\n")
})
}
}
}
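To see why the loop above leaves no entry behind for a failed URL, here is a minimal offline sketch of the same pattern. The function `fetch_table` is a hypothetical stand-in for `htmltab()` (it simply fails for odd ids), and `safe_fetch` is an illustrative wrapper name, not part of any package.

```r
# Hypothetical stand-in for htmltab(): errors for odd ids, returns a data frame otherwise.
fetch_table <- function(id) {
  if (id %% 2 == 1) stop("No table found")
  data.frame(id = id)
}

# Wrap the call so that a failure yields NULL instead of aborting the loop.
safe_fetch <- function(id) {
  tryCatch(
    fetch_table(id),
    error = function(e) {
      message("Skipping id ", id, ": ", conditionMessage(e))
      NULL
    }
  )
}

tables <- list()
for (id in 1:4) {
  # Assigning NULL with [[<- adds no element, so failed ids are simply absent.
  tables[[as.character(id)]] <- safe_fetch(id)
}

names(tables)
```

Returning NULL explicitly from the handler makes the intent clearer than relying on the return value of `cat()`, which also happens to be NULL.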
I figured out how to do this by first checking whether a table exists and, if it doesn't, moving on to the next iteration of the loop:
library(htmltab)
library(rvest)  # provides read_html() and html_nodes() used below
tables <- list()
for (i in 0:1548) {
for (j in 0:16) {
for (k in 0:4) {
a <- i+1
b <- 2003+j
c <- k+1
url <- paste("https://basketball.realgm.com/ncaa/conferences/Big-Ten-Conference/2/Michigan/",a,"/individual-games/",b,"/minutes/Season/desc/",c,sep = "")
test <- html_nodes(read_html(url),"table")
if (length(test) == 0){
next
}
tables[[paste(i,j,k,sep = "")]] <- htmltab(url,rm_nodata_cols = F,which = 1)
}
}
}
Thanks! I had looked into tryCatch but was having trouble with the syntax. Thank you!