How do I get the HTML content of a 404 error page using Python?

python python-3.x exception web-scraping

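I am using Python to fetch HTML data from multiple pages of a site. I found that urllib raises an exception when a URL does not exist. How can I retrieve the HTML of the custom 404 error page (the page that shows something like "Page not found")?

Current code: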

try:
    req = Request(URL, headers={'User-Agent': 'Mozilla/5.0'})
    client = urlopen(req)

    #downloading html data
    page_html = client.read()

    #closing connection
    client.close()
except:
    print("The following URL was not found. Program terminated.\n" + URL)
    break

Have you tried the requests library?

Just install it with pip:

pip install requests

and use it like this:

import requests

response = requests.get('https://stackoverflow.com/nonexistent_path')
print(response.status_code) # 404
print(response.text) # Prints the raw HTML response

See urllib.error.HTTPError, the exception that urlopen raises for a 404 response. It has a .read() method that returns the response content.
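As a minimal sketch of that approach, the code from the question could be reshaped roughly as follows. The helper name fetch_html, the timeout value, and the UTF-8 fallback for decoding are assumptions made for illustration and are not part of the original code.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Return (status_code, html_text) for url, including error pages such as a 404.

    fetch_html is a hypothetical helper; timeout=10 is an arbitrary choice.
    """
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urlopen(req, timeout=timeout) as client:
            status = client.status
            raw = client.read()
            charset = client.headers.get_content_charset()
    except HTTPError as err:
        # HTTPError doubles as a response object: its body is the
        # server's error page (for example a custom 404 page).
        try:
            status = err.code
            raw = err.read()
            charset = err.headers.get_content_charset()
        finally:
            err.close()
    # Assumption: fall back to UTF-8 when the server declares no charset.
    return status, raw.decode(charset or "utf-8", errors="replace")
```

Calling fetch_html(URL) inside the scraping loop returns the status code together with the HTML, so a 404 can be handled explicitly instead of being swallowed by a bare except. Failures that are not HTTP responses, such as DNS errors, still raise urllib.error.URLError and are deliberately not caught here.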