Python: Why does this simple urlopen() function generate an SSL error?

I wrote a program that scrapes a simple Wikipedia page. I am following the book "Web Scraping with Python".

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://en.wikipedia.org/wiki/Kevin_Bacon")
bsObj = BeautifulSoup(html, 'html.parser')

for link in bsObj.findAll("a"):
  if 'href' in link.attrs:
    print(link.attrs['href'])
But it generates an SSL error:

Traceback (most recent call last):
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1240, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1083, in request
    self._send_request(method, url, body, headers)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1128, in _send_request
    self.endheaders(body)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1079, in endheaders
    self._send_output(message_body)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 911, in _send_output
    self.send(msg)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 854, in send
    self.connect()
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/http/client.py", line 1237, in connect
    server_hostname=server_hostname)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 376, in wrap_socket
    _context=self)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 747, in __init__
    self.do_handshake()
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 983, in do_handshake
    self._sslobj.do_handshake()
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/ssl.py", line 628, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:646)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 6, in <module>
    html = urlopen("http://en.wikipedia.org/wiki/Kevin_Bacon")
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 162, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 471, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 581, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 503, in error
    result = self._call_chain(*args)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 443, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 686, in http_error_302
    return self.parent.open(new, timeout=req.timeout)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 465, in open
    response = self._open(req, data)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 483, in _open
    '_open', req)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 443, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1283, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/usr/local/Cellar/python3/3.5.0/Frameworks/Python.framework/Versions/3.5/lib/python3.5/urllib/request.py", line 1242, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:646)>
The Python version is 3.5.0.


The book does not cover this kind of error. Do I have to install an SSL-related package with pip?

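An SSL error occurs when the domain listed on the SSL certificate presented by the server does not match the domain the browser is connecting to. To start an HTTPS session, the domain on the certificate must exactly match the domain in the browser's address bar.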
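
As a minimal sketch of the verification step described above: in Python, both the certificate chain and the host name are checked by an ssl.SSLContext, and urlopen() accepts one through its context parameter. The CERTIFICATE_VERIFY_FAILED message in the traceback often means the chain could not be validated against the CA certificates that this Python installation can find, so printing the default verify paths is a reasonable first check. The helper name fetch is hypothetical and is not part of the question.

```python
import ssl
from urllib.request import urlopen

# The default context requires a valid certificate chain and a matching host name.
context = ssl.create_default_context()


def fetch(url, context=context):
    """Hypothetical helper: open url with an explicit SSL context, return the body as bytes."""
    with urlopen(url, context=context) as response:
        return response.read()


# Inspect what the default context enforces and where it looks for CA certificates.
print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
print(ssl.get_default_verify_paths())
```

If the printed paths point to a missing or empty CA file, the handshake is likely to fail no matter which site is requested.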

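  • This means that the URL you are trying to open with

    urllib.request.urlopen(url)
    
    did not return a proper HTTP response. You can try one of the following:

  • Check the HTTP/HTTPS proxy handler using

    urllib.request.ProxyHandler()
    
  • If you believe the URL is safe and you do not need SSL verification, try bypassing the certificate check on the HTTPS server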
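
The two suggestions above could be sketched roughly as follows. ProxyHandler() with no arguments reads the proxy settings from the environment and exposes them as a dictionary, and an SSL context with verification switched off can be passed to urlopen(). The names insecure_context and fetch_unverified are hypothetical. Disabling verification removes protection against man-in-the-middle attacks, so treat it as a diagnostic step only.

```python
import ssl
import urllib.request

# With no arguments, ProxyHandler picks up proxy settings from the environment.
handler = urllib.request.ProxyHandler()
print(handler.proxies)  # mapping of scheme to proxy URL, empty if none is configured

# Build a context that skips verification (insecure, for diagnosis only).
# check_hostname must be disabled before verify_mode can be set to CERT_NONE.
insecure_context = ssl.create_default_context()
insecure_context.check_hostname = False
insecure_context.verify_mode = ssl.CERT_NONE


def fetch_unverified(url):
    """Hypothetical helper: fetch url without certificate verification."""
    with urllib.request.urlopen(url, context=insecure_context) as response:
        return response.read()
```

If fetch_unverified succeeds where a plain urlopen() call fails, the problem lies in certificate verification on the client side rather than in the request itself.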


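  • Please don't try to scrape Wikipedia pages. Learn to use its API instead. Sure, there is a learning curve, but you can get the data you want as JSON without having to parse it with BeautifulSoup.

  • @PM2Ring Thanks for the suggestion, but I would also like to know why this error occurs.

  • I'm not quite sure, because I don't use urllib.request (I use a third-party module for HTTP these days). But http://en.wikipedia.org/wiki/Kevin_Bacon redirects to https://en.wikipedia.org/wiki/Kevin_Bacon, which requires SSL negotiation.

  • It works fine in both 3.3 and 3.4, but I don't have 3.5. Can you try other sites? Can you try 3.3 or 3.4?

  • @dstudeba Python 3.5 is acting crazy. It works in 3.4...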