
Python: extracting a specific part of HTML


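I am building a web scraper with requests-html and BeautifulSoup (I am new to this). For one web page, I am trying to scrape one part, which I will then replicate for other products. The HTML looks like: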

<div class="plp-listing-load-status c-list-header__counter initialized" data-page-number="1" data-total-pages-count="57" data-products-count="60" data-total-products-count="3361" data-status-format="{available}/{total} results">60/3361 results</div>


But both of my attempts return None. I don't know how to select the 57 specifically. Any help would be appreciated.

To get the total number of pages, you can use the following example:

import requests
from bs4 import BeautifulSoup

url = "https://www.selfridges.com/GB/en/cat/beauty/make-up/?pn=1"

# Send a browser-like User-Agent with the request.
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0"
}
soup = BeautifulSoup(requests.get(url, headers=headers).text, "html.parser")
# Select the first element that carries the attribute, then read its value.
print(soup.select_one("[data-total-pages-count]")["data-total-pages-count"])
Prints:

56
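
The same attribute selector can be tried offline against the HTML fragment from the question, which avoids depending on the live page. The following is only a minimal sketch: the names SAMPLE_HTML and total_pages are hypothetical and introduced here for illustration, and the function assumes the attribute value is always a plain integer.

```python
from bs4 import BeautifulSoup

# HTML fragment copied from the question, used instead of a live request.
SAMPLE_HTML = (
    '<div class="plp-listing-load-status c-list-header__counter initialized" '
    'data-page-number="1" data-total-pages-count="57" data-products-count="60" '
    'data-total-products-count="3361" '
    'data-status-format="{available}/{total} results">60/3361 results</div>'
)


def total_pages(html):
    """Return data-total-pages-count as an int, or None if no element has it.

    Hypothetical helper for illustration; assumes the value is an integer string.
    """
    soup = BeautifulSoup(html, "html.parser")
    # "[attr]" matches any element that carries the attribute, whatever its tag.
    tag = soup.select_one("[data-total-pages-count]")
    if tag is None:
        return None
    return int(tag["data-total-pages-count"])


print(total_pages(SAMPLE_HTML))  # 57
print(total_pages("<div>no counter here</div>"))  # None
```

Checking for None before indexing matters because select_one returns None when nothing matches, and indexing None raises a TypeError.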