Python: http.client.RemoteDisconnected errors when reading/parsing a list of URLs

I am working on a simple URL resolver: the idea is to take the URLs from a column, try to open each one, and print the location it redirects to.

My basic function works, but it occasionally throws an http.client.RemoteDisconnected exception and the program stops with the traceback shown below.


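Solved: the fix is actually simple. Add an except clause for http.client.HTTPException (in Python 2 the equivalent is: except httplib.HTTPException as e:). RemoteDisconnected derives from http.client.BadStatusLine, which is itself an HTTPException, and as the traceback shows it escapes urlopen without being wrapped in a URLError, so none of the original handlers match it.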
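
The class relationships behind this fix can be checked directly. The snippet below is a small sketch that only inspects the standard library exception hierarchy (Python 3.5 or later, where RemoteDisconnected exists); the variable names are made up for illustration.

```python
import http.client
import urllib.error

exc = http.client.RemoteDisconnected

# Not a URLError, so `except urllib.error.URLError` does not match it.
not_url_error = not issubclass(exc, urllib.error.URLError)
# It derives from BadStatusLine, which derives from HTTPException.
is_http_exception = issubclass(exc, http.client.HTTPException)
# It also derives from ConnectionResetError, which is an OSError subclass.
is_os_error = issubclass(exc, OSError)

print(not_url_error, is_http_exception, is_os_error)
```

All three checks print True. The last one suggests that catching OSError would presumably work as well, though it is much broader than HTTPException.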

The traceback from the failing run:

Traceback (most recent call last):
  File "URLIFIER.py", line 43, in <module>
    row.append(urlparse(row[0]))
  File "URLIFIER.py", line 12, in urlparse
    conn = urllib.request.urlopen(urlColumnElem,timeout=8)
  File "//anaconda/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "//anaconda/lib/python3.5/urllib/request.py", line 466, in open
    response = self._open(req, data)
  File "//anaconda/lib/python3.5/urllib/request.py", line 484, in _open
    '_open', req)
  File "//anaconda/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "//anaconda/lib/python3.5/urllib/request.py", line 1282, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "//anaconda/lib/python3.5/urllib/request.py", line 1257, in do_open
    r = h.getresponse()
  File "//anaconda/lib/python3.5/http/client.py", line 1197, in getresponse
    response.begin()
  File "//anaconda/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "//anaconda/lib/python3.5/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

The original function, which does not handle the exception:

import socket
import urllib.error
import urllib.request

def urlparse(urlColumnElem):
    try:
        # Give up after 8 seconds.
        conn = urllib.request.urlopen(urlColumnElem, timeout=8)
        # geturl() returns the final URL after any redirects, so the
        # value is the same whether or not a redirect happened.
        return conn.geturl()
    # HTTPError must come before URLError because it is a subclass of it.
    except urllib.error.HTTPError as e:
        return e.code
    except urllib.error.URLError:
        return 'URL_Error'
    except socket.timeout:
        return 'timeout'

The fixed function, with the extra handler:

import http.client
import socket
import urllib.error
import urllib.request

def urlparse(urlColumnElem):
    try:
        # Give up after 8 seconds.
        conn = urllib.request.urlopen(urlColumnElem, timeout=8)
        # geturl() returns the final URL after any redirects, so the
        # value is the same whether or not a redirect happened.
        return conn.geturl()
    # HTTPError must come before URLError because it is a subclass of it.
    except urllib.error.HTTPError as e:
        return e.code
    except urllib.error.URLError:
        return 'URL_Error'
    except socket.timeout:
        return 'timeout'
    # RemoteDisconnected is an HTTPException, so it is caught here.
    except http.client.HTTPException:
        return 'HTTPException'
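
To see the extra handler at work without depending on a flaky server, the same logic can be exercised with a stand-in for urlopen. This is only a sketch under assumptions: resolve_url, its opener parameter, fake_opener, and the example.invalid URLs are hypothetical names introduced here, and the rows list mimics the row.append(urlparse(row[0])) call visible in the traceback.

```python
import http.client
import socket
import urllib.error
import urllib.request


def resolve_url(url, opener=urllib.request.urlopen, timeout=8):
    """Return the final URL after redirects, or a short error label."""
    try:
        with opener(url, timeout=timeout) as conn:
            return conn.geturl()
    except urllib.error.HTTPError as e:
        return e.code
    except urllib.error.URLError:
        return "URL_Error"
    except socket.timeout:
        return "timeout"
    except http.client.HTTPException:
        return "HTTPException"


def fake_opener(url, timeout=None):
    # Hypothetical stand-in for urlopen: simulates a server that hangs up
    # before sending any response.
    raise http.client.RemoteDisconnected(
        "Remote end closed connection without response"
    )


# Each row holds a URL in its first column; the result is appended to it.
rows = [["http://example.invalid/a"], ["http://example.invalid/b"]]
for row in rows:
    row.append(resolve_url(row[0], opener=fake_opener))

print(rows)
```

With the handler in place the loop gets through every row and records "HTTPException" for each URL whose server disconnected, instead of stopping at the first one.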