Python: trouble reading from a URL containing non-ASCII characters despite using quote_plus

Tags: python, python-3.x, unicode, urllib, urllib2

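Note: the link listed in the comments is for Python 2.7, but this question concerns Python 3.7.

I'm using Python 3.7 and Django. I want to read from a URL whose string contains special characters, but I get an error when I try the conventional approach: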

>>> url = "https://www.supergaming.com/f/gaming/article/pvmqe/was_browsing_the_steam_app_reviews_and_ಠ_ಠ/"
...
>>> html = urllib2.urlopen(req, 5000).read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 543, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 503, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1240, in _send_request
    self.putrequest(method, url, **skips)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1107, in putrequest
    self._output(request.encode('ascii'))
UnicodeEncodeError: 'ascii' codec can't encode character '\u0ca0' in position 69: ordinal not in range(128)
So I tried the solution recommended here, but I still can't read the URL:

>>> urllib.parse.quote_plus(url)
'https%3A%2F%2Fwww.supergaming.com%2Ff%2Fgaming%2Farticle%2Fpvmqe%2Fwas_browsing_the_steam_app_reviews_and_%E0%B2%A0_%E0%B2%A0%2F'
>>> req = urllib2.Request(urllib.parse.quote_plus(url))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 328, in __init__
    self.full_url = url
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 354, in full_url
    self._parse()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py", line 383, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: 'https%3A%2F%2Fwww.supergaming.com%2Ff%2Fgaming%2Farticle%2Fpvmqe%2Fwas_browsing_the_steam_app_reviews_and_%E0%B2%A0_%E0%B2%A0%2F'

If a URL contains special characters, what is the correct way to read from it?

The link you listed is for Python 2.7. My question relates to Python 3.7.

Have you tried it? I believe the same answer applies.

I get the error "AttributeError: module 'urllib' has no attribute 'quote'".

In Python 3, this would be urllib.request.quote.
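
One possible approach, offered as a sketch rather than something taken from the thread: quote_plus fails in the question because it percent-encodes the entire string, including the "://" after the scheme, so urllib can no longer recognize a URL type. A common alternative is to split the URL and percent-encode only the path and query with urllib.parse.quote. The helper name to_ascii_url is hypothetical, and the sketch assumes the host name is already ASCII (a non-ASCII host would need IDNA encoding, which is not handled here).

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Percent-encode non-ASCII characters in the path and query only.

    Hypothetical helper for illustration. The scheme and host are left
    untouched, so the result is still recognizable as a URL.
    """
    parts = urlsplit(url)
    # "%" is kept safe so an already-encoded URL is not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


url = "https://www.supergaming.com/f/gaming/article/pvmqe/was_browsing_the_steam_app_reviews_and_ಠ_ಠ/"
safe_url = to_ascii_url(url)
print(safe_url)
```

The resulting string contains only ASCII characters, so it can be passed to urllib.request.urlopen(safe_url) without triggering the UnicodeEncodeError shown in the question. Whether the server then returns the expected page depends on the site.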