Why can't I decode the zhihu.com response from urllib.request.urlopen(url).read()?

Tags: python, unicode, urllib, python-3.4

I found an example that outputs:

Nov. 25, 09:43:32 PM EST
I tried to adapt that code for a Chinese website:

import urllib.request

url = 'http://www.zhihu.com'
response = urllib.request.urlopen(url).read().decode("utf-8")
print(response) 
But I get an error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte. 

How can I solve this?

The site returns a gzip-compressed response even though you did not request one:

>>> from urllib.request import urlopen
>>> url = 'http://www.zhihu.com'
>>> response = urlopen(url)
>>> response.info().get('Content-Encoding')
'gzip'
Sending gzip to a client that did not ask for it violates the HTTP RFCs, and the site does so even when you explicitly forbid it:

>>> from urllib.request import Request
>>> response = urlopen(Request(url, headers={'Accept-Encoding': 'identity,gzip;q=0'}))
>>> response.info().get('Content-Encoding')
'gzip'
You have to decompress the response data first, and only then decode the resulting bytes as UTF-8:

>>> import zlib
>>> decompressed_data = zlib.decompress(response.read(), 16+zlib.MAX_WBITS)
>>> print(*decompressed_data.decode('utf8').splitlines(True)[:10])
<!DOCTYPE html>
 <html lang="zh-CN">
 <head>
 <meta charset="utf-8">
 <meta name="apple-itunes-app" content="app-id=432274380">
 <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
 <meta name="renderer" content="webkit" />
 <meta name="description" content="一个真实的网络问答社区,帮助你寻找答案,分享知识。"/>
 <meta name="viewport" content="user-scalable=no, width=device-width, initial-scale=1.0, maximum-scale=1.0"/>
 <title>知乎 - 与世界分享你的知识、经验和见解</title>
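
The byte 0x8b in the original error message is itself a hint: every gzip stream starts with the two-byte magic number 1f 8b, and 0x8b is not a valid UTF-8 start byte. The following is a small offline sketch of that failure, using a locally compressed sample string as a stand-in for the real response body (the sample text is an assumption, not data from the site):

```python
import gzip

# Stand-in for the compressed body the server sends back (assumed sample text).
compressed = gzip.compress("知乎".encode("utf-8"))

# Every gzip stream begins with the magic number 0x1f 0x8b.
print(compressed[:2])

# Decoding the still-compressed bytes fails on the second magic byte.
try:
    compressed.decode("utf-8")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

Seeing 0x8b at position 1 in a UnicodeDecodeError is therefore a good reason to check the Content-Encoding header before suspecting the character set.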
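
If you do not know in advance whether a server will compress its response, the decompress-then-decode step can be wrapped in a small helper. This is only a sketch: `decode_body` is a hypothetical name, the UTF-8 default is an assumption, and the demo data is compressed locally so the example runs without network access.

```python
import gzip
import zlib


def decode_body(raw, content_encoding=None, charset="utf-8"):
    """Return the body as text, decompressing gzip data first when needed.

    Hypothetical helper: `content_encoding` is the value of the response's
    Content-Encoding header, or None if the header is absent.
    """
    declared_gzip = (content_encoding or "").strip().lower() == "gzip"
    # Also check for the gzip magic number in case the header is missing.
    if declared_gzip or raw[:2] == b"\x1f\x8b":
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        raw = zlib.decompress(raw, 16 + zlib.MAX_WBITS)
    return raw.decode(charset)


# Offline demo with a locally compressed sample page (assumed content).
sample_page = "<title>知乎</title>"
sample_body = gzip.compress(sample_page.encode("utf-8"))
print(decode_body(sample_body, "gzip"))
```

With a real response this would be called as `decode_body(response.read(), response.info().get('Content-Encoding'))`. The standard library's `gzip.decompress()` is an alternative to the `zlib.decompress()` call for complete gzip bodies.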