Why can't I scrape data from a website with BeautifulSoup? I'm getting a timeout error


I'm trying to fetch data from the following website, but I get the error shown below. Here is the code:

from urllib2 import urlopen
import bs4 as bs
response = urlopen('http://www.mec.ac.in/mec/stats2018.php')
html = response.read()
soup = bs.BeautifulSoup(html,'lxml')
print soup.title

Here is the error:

Traceback (most recent call last):
  File "et.py", line 3, in <module>
    response = urlopen('http://www.mec.ac.in/mec/stats2018.php')
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 435, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 548, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 473, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 556, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden


How can I recover from this error and retrieve the data?

You can't; the website flatly denies you access to that resource.
I tested the URL using requests and it seems to work fine (status code 200).
OK, thanks for your help.
The server doesn't allow you to get this exact URL: 403 (Forbidden).
The "timeout error" in the title is misleading; the traceback in the question clearly shows an access permission error.

The server is specifically "blocking" requests whose User-Agent header contains the Python-urllib string (which urllib2 and urllib send by default):

In [1]: import requests

In [2]: url = "http://www.mec.ac.in/mec/stats2018.php"

In [3]: requests.get(url, headers={'User-Agent': 'Python-urllib/2.6'})
Out[3]: <Response [403]>

In [4]: requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'})
Out[4]: <Response [200]>
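
The same idea can be applied without requests. The following is a minimal sketch, assuming Python 3, where urllib.request replaces urllib2. The names BROWSER_UA, default_user_agent, build_request and fetch_title are made up for illustration, and the browser string is just the one from the session above. The built-in html.parser is used so that lxml is not required.

```python
import urllib.request
from urllib.error import HTTPError

# Illustrative browser-like value (taken from the session above);
# any User-Agent the server accepts should work the same way.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/70.0.3538.77 Safari/537.36"
)


def default_user_agent():
    """Return the User-Agent that urllib sends when none is set."""
    opener = urllib.request.build_opener()
    return dict(opener.addheaders)["User-agent"]


def build_request(url, user_agent=BROWSER_UA):
    """Build a request that overrides urllib's default User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_title(url, timeout=10):
    """Fetch the page and return its <title> text, or None on an HTTP error."""
    # Imported here so the helpers above work without bs4 installed.
    from bs4 import BeautifulSoup

    try:
        with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
            html = resp.read()
    except HTTPError as err:
        # A 403 here means the server still refuses the request.
        print("Request failed: %s %s" % (err.code, err.reason))
        return None
    soup = BeautifulSoup(html, "html.parser")
    return soup.title.string if soup.title else None
```

Calling fetch_title('http://www.mec.ac.in/mec/stats2018.php') should then behave like In [4] above, provided the server still filters only on the User-Agent header. default_user_agent() shows the Python-urllib/x.y value that triggers the 403.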