Python: Why can't I get scraped data from a website with BeautifulSoup? I'm getting a timeout error
Tags: python, beautifulsoup, python-requests, urllib2, urlopen

I am trying to fetch data from the following website, but I get the error shown below. Please find the code below:

from urllib2 import urlopen
import bs4 as bs
response = urlopen('http://www.mec.ac.in/mec/stats2018.php')
html = response.read()
soup = bs.BeautifulSoup(html,'lxml')
print soup.title
Please find the error below:

Traceback (most recent call last):
  File "et.py", line 3, in <module>
    response = urlopen('http://www.mec.ac.in/mec/stats2018.php')
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 435, in open
    response = meth(req, response)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 548, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 473, in error
    return self._call_chain(*args)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 556, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
How can I get past this error and retrieve the data?

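The server specifically "blocks" requests whose User-Agent header contains the Python-urllib string (which urllib2/urllib send by default). The same request returns 403 with a Python-urllib User-Agent and 200 with a browser User-Agent.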


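Comments:
You can't; the website flatly refuses you access to that resource.
I tested the URL using get (from requests) and it seems to work fine (status code 200).
OK, thanks for the help.
The server is not allowing you to fetch this exact URL. 403 (Forbidden).
The title is misleading with "timeout error"; the traceback in the question clearly shows an access-permission error.
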
In [1]: import requests

In [2]: url = "http://www.mec.ac.in/mec/stats2018.php"

In [3]: requests.get(url, headers={'User-Agent': 'Python-urllib/2.6'})
Out[3]: <Response [403]>

In [4]: requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'})
Out[4]: <Response [200]>
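
The same idea can be applied without requests by overriding the default header on a urllib request. This is a minimal sketch, assuming Python 3 (where urllib2 became urllib.request) and assuming the server still filters only on the User-Agent string; the helper names build_request and fetch_html are hypothetical, not part of any library.

```python
from urllib.request import Request, urlopen

# Browser User-Agent copied from the session above; any common browser
# string should behave the same way (assumption, not verified here).
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request whose User-Agent replaces the default Python-urllib one."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10):
    """Download the page body as bytes.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses such as 403.
    """
    with urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

Calling fetch_html('http://www.mec.ac.in/mec/stats2018.php') and passing the returned bytes to bs.BeautifulSoup(html, 'lxml') should then give a parsed document, provided the site still accepts browser-like requests.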