Python BeautifulSoup: HTTPResponse has no attribute 'encode'


python python-3.x


I am trying to get BeautifulSoup to work with a URL, as shown below:

from urllib.request import urlopen
from bs4 import BeautifulSoup
html = urlopen("http://proxies.org")
soup = BeautifulSoup(html.encode("utf-8"), "html.parser")
print(soup.find_all('a'))

However, I get an error:

 File "c:\Python3\ProxyList.py", line 3, in <module>
    html = urlopen("http://proxies.org").encode("utf-8")
AttributeError: 'HTTPResponse' object has no attribute 'encode'


Any idea why? Could it be related to the urlopen function? And why is utf-8 needed?

Apparently Python 3 and BeautifulSoup4 behave differently from the examples I found, which now seem to be outdated or wrong.

It doesn't work because urlopen returns an HTTPResponse object, and you are treating it as plain HTML. You need to chain the .read() method on the response to get the HTML:

from urllib.request import urlopen
from bs4 import BeautifulSoup

response = urlopen("http://proxies.org")
html = response.read()  # bytes, not str
soup = BeautifulSoup(html.decode("utf-8"), "html.parser")
print(soup.find_all('a'))

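You probably also want html.decode("utf-8") rather than html.encode("utf-8"): read() returns bytes, and decode() turns bytes into a str, while encode() goes the other way.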
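To make the bytes/str distinction concrete, here is a minimal sketch. It uses io.BytesIO as an offline stand-in for the response, and read_html and fallback_charset are hypothetical names chosen for illustration, not part of urllib or BeautifulSoup. With a real HTTPResponse, the charset can be taken from response.headers.get_content_charset() when the server declares one; falling back to UTF-8 is an assumption.

```python
import io


def read_html(response, fallback_charset="utf-8"):
    """Return the body of a file-like HTTP response as a str.

    Hypothetical helper: `response` only needs a read() method that
    returns bytes, which is what urlopen(...) gives you.
    """
    raw = response.read()  # bytes: has decode(), but no encode()
    # A real HTTPResponse exposes headers.get_content_charset();
    # the BytesIO stand-in below has no headers, so we fall back.
    headers = getattr(response, "headers", None)
    charset = headers.get_content_charset() if headers is not None else None
    return raw.decode(charset or fallback_charset, errors="replace")


# Offline stand-in for urlopen("http://proxies.org")
fake_response = io.BytesIO("<a href='/x'>caf\u00e9</a>".encode("utf-8"))
html = read_html(fake_response)
print(type(html).__name__)
```

The same helper should work unchanged on the object returned by urlopen, since it only relies on read() and, when present, the headers attribute.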

Try this:

soup = BeautifulSoup(html.read().decode('utf-8'), "html.parser")

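First of all, urlopen returns a file-like object. BeautifulSoup accepts a file-like object and decodes it automatically, so you don't have to worry about the encoding.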

To parse a document, pass it into the BeautifulSoup constructor. You can pass in a string or an open filehandle:

from bs4 import BeautifulSoup

soup = BeautifulSoup(open("index.html"))

soup = BeautifulSoup("<html>data</html>")


First, the document is converted to Unicode, and HTML entities are converted to Unicode characters.
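As a rough illustration of passing a file-like object straight to the constructor, the sketch below feeds BeautifulSoup an io.BytesIO object standing in for the response, so it runs without network access. The sample markup and the links_from_url helper are made up for this example; the helper is never called here and only shows how the same idea might look with a real URL.

```python
import io
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Stand-in for urlopen(...): any object whose read() returns bytes.
fake_response = io.BytesIO(
    b"<html><body>"
    b"<a href='http://example.com/a'>A</a>"
    b"<a href='/b'>B</a>"
    b"</body></html>"
)

# No explicit read() or decode(): BeautifulSoup reads the object
# and works out the encoding of the bytes itself.
soup = BeautifulSoup(fake_response, "html.parser")
hrefs = [a.get("href") for a in soup.find_all("a")]
print(hrefs)


def links_from_url(url):
    """Hypothetical helper showing the same idea against a live URL."""
    with urlopen(url) as response:
        page = BeautifulSoup(response, "html.parser")
    return [a.get("href") for a in page.find_all("a")]
```

Letting BeautifulSoup detect the encoding avoids hard-coding UTF-8, which matters for pages served in another charset.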

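Hi Josh, this still doesn't work for me. I'm using exactly the same code as yours and it gives me a "character maps to" error.

This ended up being the solution that was needed.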

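The "character maps to" error from the comment is not shown in full, so the following is only a guess at its cause: it assumes a UnicodeEncodeError raised while printing text to a console that uses a legacy code page such as cp1252. The sketch reproduces that error offline and shows one possible workaround; safe_for_console is a hypothetical helper name.

```python
def safe_for_console(text, encoding):
    """Replace characters that `encoding` cannot represent with '?'.

    Hypothetical helper: round-trips the text through the target
    encoding so that printing it can no longer raise UnicodeEncodeError.
    """
    return text.encode(encoding, errors="replace").decode(encoding)


text = "Proxy list \u2713"  # the check mark has no cp1252 equivalent

try:
    text.encode("cp1252")
except UnicodeEncodeError as exc:
    # The message names the 'charmap' codec and ends with
    # "character maps to <undefined>".
    error_message = str(exc)
    print(error_message)

printable = safe_for_console(text, "cp1252")
print(printable)
```

On Python 3.7 and later, calling sys.stdout.reconfigure(encoding="utf-8") before printing is another option, provided the terminal can display UTF-8.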