Web scraping: scraping multiple pages without manually getting the page count


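We are currently working on a property web scraper and are trying to scrape multiple pages (there are 5 pages in total) without hard-coding the page range.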

for num in range(0, 5):
    url = "https://www.property24.com/for-sale/woodland-hills-wildlife-estate/bloemfontein/free-state/10467/p" + str(num)

How can we output the URLs of all pages without manually typing in the page range?

Could the page count perhaps be worked out from ul class="pagination"?

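You can use the pagination class to get the last <a> tag, read the page number from its data-pagenumber attribute, and then use that number to build all the links. The code below does this. Let me know if you have any questions :)

Code: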
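import requests
from bs4 import BeautifulSoup

# url = "https://www.property24.com/for-sale/woodland-hills-wildlife-estate/bloemfontein/free-state/10467"
url = "https://www.property24.com/for-sale/woodstock/cape-town/western-cape/10164"
response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")

# The last <a> inside <ul class="pagination"> is expected to hold the highest
# page number in its data-pagenumber attribute.
last_link = soup.find("ul", {"class": "pagination"}).find_all("a")[-1]
no_of_pages = int(last_link["data-pagenumber"])

# Build and print the URL of every page, from p1 up to the last page.
for i in range(1, no_of_pages + 1):
    print(f"{url}/p{i}")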
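
The answer's approach relies on the last link in the pagination list carrying data-pagenumber. If the last link turned out to be something like a "Next" button without that attribute, taking the largest data-pagenumber found in the list would be a bit more forgiving. Below is a rough sketch of that idea. To keep it runnable without network access or extra packages, it uses the standard library's html.parser instead of BeautifulSoup, and it runs on made-up sample markup. The names PaginationParser, page_urls and SAMPLE_HTML, the example.com URL and the sample HTML itself are all assumptions for illustration; the real page's markup may differ.

```python
from html.parser import HTMLParser


class PaginationParser(HTMLParser):
    """Collect data-pagenumber values from links inside <ul class="pagination">."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # <ul> nesting depth inside the pagination list; 0 means outside
        self.page_numbers = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "ul":
            classes = (attrs.get("class") or "").split()
            # Enter the pagination list, or track a nested <ul> inside it.
            if self.depth > 0 or "pagination" in classes:
                self.depth += 1
        elif tag == "a" and self.depth > 0:
            value = attrs.get("data-pagenumber")
            # Skip links without a numeric page number (e.g. a "Next" link).
            if value and value.isdigit():
                self.page_numbers.append(int(value))

    def handle_endtag(self, tag):
        if tag == "ul" and self.depth > 0:
            self.depth -= 1


def page_urls(html, base_url):
    """Return the URLs for page 1 up to the highest page number found."""
    parser = PaginationParser()
    parser.feed(html)
    # Assumption: with no pagination links at all, there is a single page.
    last_page = max(parser.page_numbers, default=1)
    return [f"{base_url}/p{n}" for n in range(1, last_page + 1)]


# Hypothetical markup, only loosely modelled on a pagination list.
SAMPLE_HTML = """
<ul class="pagination">
  <li><a data-pagenumber="1" href="#">1</a></li>
  <li><a data-pagenumber="2" href="#">2</a></li>
  <li><a data-pagenumber="5" href="#">5</a></li>
  <li><a href="#">Next</a></li>
</ul>
"""

urls = page_urls(SAMPLE_HTML, "https://example.com/listing")
for u in urls:
    print(u)
```

With BeautifulSoup the same idea should come down to taking the maximum of the data-pagenumber values over soup.select("ul.pagination a[data-pagenumber]"), and in a real scraper the HTML string would come from requests.get(url).text rather than from a sample.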