R error: invalid subscript type 'list' (scraping)

r list web-scraping

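I am trying to scrape data from the URL assigned to baseurl in the code below. I want to follow the link behind each college name and get the specific data for each college.

First, I collected all the college URLs in a vector: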

#loading the package:
library(xml2)
library(rvest)
library(stringr)
library(dplyr)

#Specifying the url of the website to be scraped
baseurl <- "https://university.careers360.com/colleges/list-of-degree-colleges-in-India"

#Reading the html content from the listing page
basewebpage <- read_html(baseurl)

#Extracting college name and its url
scraplinks <- function(url){
   #Create an html document from the url
   webpage <- xml2::read_html(url)
   #Extract the URLs
   url_ <- webpage %>%
      rvest::html_nodes(".title a") %>%
      rvest::html_attr("href")
   #Extract the link text
   link_ <- webpage %>%
      rvest::html_nodes(".title a") %>%
      rvest::html_text()
   #tibble() replaces the deprecated data_frame()
   return(tibble(link = link_, url = url_))
}

#College names and Urls
allcollegeurls<-scraplinks(baseurl)

#Reading each college url
library(purrr)
allreadurls <- map(allcollegeurls$url, read_html)

I am not sure about the scraped content itself, but you may want to replace your loop with
for (i in seq_along(allreadurls)) {
  # [[i]] extracts the parsed page itself; [i] would return a list of length 1
  allcollegeurls$Specialization[i] <- html_nodes(allreadurls[[i]], 'td:nth-child(1)')
}

Finally, since allreadurls is a list, you want to subset it with [[i]] rather than [i] (which again returns a list). There is also no need for the trailing [].
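The difference between the two subsetting operators can be checked without any scraping. The sketch below uses a plain list of strings as a hypothetical stand-in for allreadurls; the names pages, single_bracket, double_bracket and result are made up for illustration. The last step shows one way the error from the title can arise, namely when a list is used as the index.

```r
# A plain list standing in for allreadurls (hypothetical contents)
pages <- list("page one", "page two", "page three")

single_bracket <- pages[2]    # still a list, of length 1, wrapping the element
double_bracket <- pages[[2]]  # the element itself

class(single_bracket)  # "list"
class(double_bracket)  # "character"

# Using a list as the index (instead of a number or a name) is one way
# to trigger the error; tryCatch() captures the message instead of stopping
result <- tryCatch(pages[list(1)], error = function(e) conditionMessage(e))
print(result)
```

Functions such as html_nodes() expect the parsed document itself, which is why [[i]] is the form to use inside the loop.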
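Because html_nodes() can return several nodes per page, assigning its result to a single cell may not keep all of them. As a rough sketch of an alternative, and not necessarily what the original poster needs, the text of the matched cells can be collected into a list column with map(). The two small HTML snippets, the colleges data frame and the specializations variable are hypothetical stand-ins so that the example runs without network access.

```r
library(xml2)
library(rvest)
library(purrr)

# Two tiny stand-in pages parsed from literal HTML (hypothetical content)
pages <- list(
  read_html("<table><tr><td>Arts</td><td>3 yrs</td></tr><tr><td>Science</td><td>3 yrs</td></tr></table>"),
  read_html("<table><tr><td>Commerce</td><td>3 yrs</td></tr></table>")
)

# Stand-in for allcollegeurls: one row per page
colleges <- data.frame(link = c("College A", "College B"),
                       stringsAsFactors = FALSE)

# One character vector of first-column cell text per page
specializations <- map(pages, function(page) {
  html_text(html_nodes(page, "td:nth-child(1)"))
})

# Store the results as a list column, one list element per row
colleges$Specialization <- specializations
print(colleges$Specialization)
```

A list column keeps every matched cell for each college, at the cost of needing [[i]] again when reading a single row back out.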