Python http.client.RemoteDisconnected error while reading/parsing a list of URLs
I am working on a simple URL parser: the idea is to take a URL from a column, try to open it, and print out the URL it redirects to. My basic function works, but it occasionally throws an http.client.RemoteDisconnected exception and the program stops with the error shown below.
Solved: it turned out to be simple. Add a handler for http.client.HTTPException (in Python 2 the equivalent is httplib.HTTPException as e:). The traceback was:
Traceback (most recent call last):
  File "URLIFIER.py", line 43, in <module>
    row.append(urlparse(row[0]))
  File "URLIFIER.py", line 12, in urlparse
    conn = urllib.request.urlopen(urlColumnElem,timeout=8)
  File "//anaconda/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "//anaconda/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "//anaconda/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "//anaconda/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "//anaconda/lib/python3.5/urllib/request.py", line 1282, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "//anaconda/lib/python3.5/urllib/request.py", line 1257, in do_open
    r = h.getresponse()
  File "//anaconda/lib/python3.5/http/client.py", line 1197, in getresponse
    response.begin()
  File "//anaconda/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "//anaconda/lib/python3.5/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
The original function:

import socket
import urllib.request
import urllib.error

def urlparse(urlColumnElem):
    try:
        # default timeout is 8 seconds
        conn = urllib.request.urlopen(urlColumnElem, timeout=8)
        redirect = conn.geturl()
        # check whether the URL redirected
        if redirect == urlColumnElem:
            # print("same: ")
            # print(redirect)
            return redirect
        else:
            # print("Not the same url ")
            return redirect
    # catch all the exceptions
    except urllib.error.HTTPError as e:
        return e.code
    except urllib.error.URLError:
        return 'URL_Error'
    except socket.timeout:
        return 'timeout'
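For anyone who wants to see the failure in isolation, the error can be reproduced with a throwaway local server that reads the request and then closes the connection without ever sending an HTTP response. This is a hypothetical sketch, not part of the original question:

```python
import http.client
import socket
import threading
import urllib.request

# Throwaway server: accept one connection, read the request,
# then hang up without sending any status line.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

def close_without_reply():
    conn, _ = srv.accept()
    conn.recv(1024)   # consume the request so the close is a clean FIN
    conn.close()      # no response is ever written

threading.Thread(target=close_without_reply, daemon=True).start()

err_name = None
try:
    urllib.request.urlopen("http://127.0.0.1:%d" % port, timeout=8)
except http.client.RemoteDisconnected as e:
    err_name = type(e).__name__

print(err_name)
```

The client gets end-of-file while waiting for the status line, which is exactly the "Remote end closed connection without response" case from the traceback.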
With the extra except clause added, the function no longer dies on these URLs:

import http.client
import socket
import urllib.request
import urllib.error

def urlparse(urlColumnElem):
    try:
        # default timeout is 8 seconds
        conn = urllib.request.urlopen(urlColumnElem, timeout=8)
        redirect = conn.geturl()
        # check whether the URL redirected
        if redirect == urlColumnElem:
            # print("same: ")
            # print(redirect)
            return redirect
        else:
            # print("Not the same url ")
            return redirect
    # catch all the exceptions
    except urllib.error.HTTPError as e:
        return e.code
    except urllib.error.URLError:
        return 'URL_Error'
    except socket.timeout:
        return 'timeout'
    except http.client.HTTPException:
        return 'HTTPException'
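Why does catching http.client.HTTPException cover this case? In Python 3.5+, RemoteDisconnected is defined in http.client as a subclass of both BadStatusLine (which derives from HTTPException) and ConnectionResetError, so the new except clause catches it. A quick check:

```python
import http.client

# RemoteDisconnected -> BadStatusLine -> HTTPException
print(issubclass(http.client.RemoteDisconnected, http.client.BadStatusLine))   # True
print(issubclass(http.client.RemoteDisconnected, http.client.HTTPException))   # True
print(issubclass(http.client.RemoteDisconnected, ConnectionResetError))        # True
```

This also means you could catch http.client.RemoteDisconnected directly if you want to handle it separately from other HTTP protocol errors; catching the broader HTTPException, as above, is the simpler fix.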