Scraping multiple nodes on a particular URL using R


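I'm having trouble scraping seller prices from a wine price listing website. I only get the first result and none of the ones that follow.

My current loop returns the first price on each page and then moves on to the next page defined by my list of URLs. This is what I have so far: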

library(rvest)    # read_html(), html_nodes(), and the %>% pipe
library(pbapply)  # pblapply()

# clean.df$srchwrds contains 1,000+ search phrases which I've pre-defined.

urls = lapply(clean.df$srchwrds, . %>% paste("http://www.wine-searcher.com/find/",.,"/", sep = ""))

out = pblapply(urls, function(x) {
    print(x)
    page = read_html(x)
    temp = page %>% html_nodes('.offer_price')
    out = temp
    return (out)
})
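
For example, you'll notice that this URL lists multiple sellers:

My script picks up the price of the first seller it encounters and ignores the rest. After returning the first seller's price, it moves on to the next URL defined in the list.

I'd like it to return all of the prices on each page before moving on.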


Thanks in advance for your help.

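There are several ways to achieve what you want, depending on the output you need.

In the first example, a list is returned, with one character vector per page. By the way, this is very similar to what you did, so I'm not sure what your problem is.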

library(rvest)

## Loading required package: xml2

baseUrl <- 'http://www.wine-searcher.com/find/'

srchwrds <- c("chateau+petrus+chateau+petrus+2014", 
              "chateau+petrus+chateau+petrus+2015")


result <- sapply(srchwrds, function(x) {
    paste0(baseUrl, x) %>% 
        read_html() %>% 
        html_nodes('.offer_price') %>% 
        html_attr('content') 
})
result 

## $`chateau+petrus+chateau+petrus+2014`
##  [1] "1618.35" "1622.06" "1622.98" "1676.47" "1800.00" "1854.83" "2133.06"
##  [8] "3385.08" "4542.50" "9517.24" "9517.24"
## 
## $`chateau+petrus+chateau+petrus+2015`
##  [1] "2264.71"  "2499.40"  "2500.00"  "2550.40"  "2577.59"  "2735.89" 
##  [7] "2777.62"  "2782.25"  "2840.00"  "5096.17"  "10665.32" "21098.79"
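
Note that the prices above come back as character strings, because html_attr() returns text. If numbers are needed, a conversion along these lines might help. This is only a sketch: the `result` list below is a hand-typed, truncated stand-in for the real output (first few prices only), and the names `prices`, `n_offers` and `min_price` are made up for illustration.

```r
# Truncated stand-in for the list returned by sapply() above
# (only the first few prices per search phrase are reproduced here).
result <- list(
  "chateau+petrus+chateau+petrus+2014" = c("1618.35", "1622.06", "1622.98"),
  "chateau+petrus+chateau+petrus+2015" = c("2264.71", "2499.40")
)

# Convert each character vector to numeric, keeping the list names.
prices <- lapply(result, as.numeric)

# Number of offers found per search phrase.
n_offers <- lengths(prices)

# Lowest price per search phrase; vapply() guarantees one number per element.
min_price <- vapply(prices, min, numeric(1))

print(n_offers)
print(min_price)
```

Keeping the list names means each summary value can still be traced back to the search phrase it came from.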
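
In the second example, a single data frame is returned, with one row per offer and columns for seller and price:

library(purrr)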

result <- map_df(srchwrds, ~{

    paste0(baseUrl, .x) %>% 
        read_html() %>% 
        html_nodes('[itemprop="offers"]') -> tmp
    price <- tmp %>% 
        html_nodes('.offer_price') %>% 
        html_attr('content') 
    seller <- tmp %>% 
        html_nodes('.seller-link-wrap') %>% 
        html_text() %>% 
        gsub('\n', '', ., fixed = TRUE)  # strip newlines around the seller name
    data.frame(seller = seller, price = price, stringsAsFactors = FALSE)
})

result

##                             seller    price
## 1            JJ Buckley Fine Wines  1618.35
## 2               K&L Wine Merchants  1622.06
## 3                Morrell & Company  1622.98
## 4  Weinemotionen - KK Handels GmbH  1676.47
## 5                 Vins Grands Crus  1800.00
## 6     Zachys Wine and Liquor, Inc.  1854.83
## 7                             Arvi  2133.06
## 8                Morrell & Company  3385.08
## 9                      Cellar & Co  4542.50
## 10                Vinum Fine Wines  9517.24
## 11                Vinum Fine Wines  9517.24
## 12                Bacchus-Vinothek  2264.71
## 13           JJ Buckley Fine Wines  2499.40
## 14                Vins Grands Crus  2500.00
## 15               Morrell & Company  2550.40
## 16                Vinum Fine Wines  2577.59
## 17        Fine Wines International  2735.89
## 18                  Sherry-Lehmann  2777.62
## 19              K&L Wine Merchants  2782.25
## 20                      FinestWine  2840.00
## 21           JJ Buckley Fine Wines  5096.17
## 22               Morrell & Company 10665.32
## 23           JJ Buckley Fine Wines 21098.79
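
One thing the combined data frame loses is which search phrase each row came from. The sketch below shows one possible way to keep that, and it can be tried without hitting the website. Everything in it is an assumption for illustration: `sample_html` is hypothetical, heavily simplified markup that only imitates the selectors used above (the real page is more complex and may have changed), and `parse_offers()` is a made-up helper name.

```r
library(rvest)

# Hypothetical, simplified markup imitating the selectors used above.
# The second offer deliberately has no seller element.
sample_html <- '
<html><body>
  <div itemprop="offers">
    <div class="seller-link-wrap">
Seller A</div>
    <span class="offer_price" content="10.50">$10.50</span>
  </div>
  <div itemprop="offers">
    <span class="offer_price" content="20.00">$20.00</span>
  </div>
</body></html>'

# Made-up helper: turn one parsed page into a data frame of offers.
# html_node() (singular) applied to a node set returns exactly one result
# per offer, with NA where nothing matches, so the columns stay aligned.
parse_offers <- function(page, search_term) {
  offers <- html_nodes(page, '[itemprop="offers"]')
  data.frame(
    search = rep(search_term, length(offers)),
    seller = trimws(html_text(html_node(offers, '.seller-link-wrap'))),
    price  = as.numeric(html_attr(html_node(offers, '.offer_price'), 'content')),
    stringsAsFactors = FALSE
  )
}

offers_df <- parse_offers(read_html(sample_html), "sample+wine+2014")
print(offers_df)
```

With real pages, the same helper could be called inside map_df() as parse_offers(read_html(paste0(baseUrl, .x)), .x). Looking up seller and price per offer, rather than collecting two separate vectors, avoids an error from data.frame() when a page has an offer with a missing seller or price.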