Incorrect output when downloading an .html file in Python

Tags: python, urllib2

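I just want to download an .html file in Python. Code:

But sometimes this produces output like this, which I don't understand at all. I read the file with goose in Python, and when I read a file like this, it shows nothing.

When it doesn't work:

Look at the response headers: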

>>> print page.info()
Cache-Control: public, max-age=300, s-maxage=300
Content-Type: text/html;charset=utf-8
Server: fs3
Age: 103
Expires: Thu, 21 May 2015 13:36:40 GMT
Content-Encoding: gzip
Transfer-Encoding: chunked
Connection: close
Vary: Accept-encoding, Accept-Encoding
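The headers show that the content is gzip-encoded (Content-Encoding: gzip). Try decompressing the data with a decompression module.

To check whether the data is gzipped, add the following lines: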

if page.info().get('Content-Encoding', '') == 'gzip':
    ... # decompress data
See the referenced post for an example of how to decompress the body.
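A minimal sketch of that check and the decompression step, assuming Python 3 and the standard-library gzip module. The helper name maybe_gunzip and its parameters are made up for illustration and are not part of the original answer. With urllib2 you would pass page.read() as the body and page.info().get('Content-Encoding', '') as the header value.

```python
import gzip


def maybe_gunzip(body, content_encoding=""):
    """Return the body decompressed if the header says gzip, else unchanged.

    body is the raw response body as bytes; content_encoding is the value
    of the Content-Encoding response header (empty string if absent).
    """
    if content_encoding.strip().lower() == "gzip":
        return gzip.decompress(body)
    # Not gzipped (or an encoding this sketch does not handle): pass through
    return body


# Self-check without any network access: compress a small page locally
html = b"<!DOCTYPE html><html><head></head></html>"
assert maybe_gunzip(gzip.compress(html), "gzip") == html
assert maybe_gunzip(html, "") == html
```

Because the helper passes uncompressed bodies through untouched, the same call can be used for every URL in a list, whether or not a given server compressed its response.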

Using requests does all the work for you. Use req.content to let requests handle the encoding:

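import requests

urls = ['http://www.nydailynews.com/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211']

# Used as a filename prefix: files are written as Python0.html, Python1.html, ...
path = 'C:/Users/sony/Desktop/Python'

for i, site in enumerate(urls):
    print(site)
    req = requests.get(site)
    # .content is the body as bytes; requests has already undone the gzip encoding
    page_content = req.content
    # Open in binary mode because page_content is bytes
    with open('{}{}.html'.format(path, i), 'wb') as fid:
        fid.write(page_content)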
Output:

 <!DOCTYPE html> <!--NEW--> <!--- www pageHead.vm ---> <!--- mode=www ---> <!--- URI=/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211 ---> <!--- Host=www.nydailynews.com ---> <!--[if IE 8]><html class="ie8" lang="en" itemscope itemtype="http://schema.org/"><![endif]--> <!--[if IE 9]><html class="ie9" lang="en" itemscope itemtype="http://schema.org/"><![endif]--> <!--[if IE 10]><html class="ie10" lang="en" itemscope itemtype="http://schema.org/"><![endif]--> <!--[if IE 11]><html class="ie11" lang="en" itemscope itemtype="http://schema.org/"><![endif]--> <!--[if !IE]><!--> <html lang="en" itemscope itemtype="http://schema.org/"> <!--<![endif]-->       <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> <meta http-equiv="X-UA-Compatible" content="IE=edge"/>          <title>Fans stampede outside South African stadium - NY Daily News</title>     <meta name="nydn_section" content="Sports"/>   <meta name="viewport" content="width=1070, maximum-scale=1.0"/>  <meta property="fb:app_id" content="107464888913"/> <meta property="fb:admins" content="1594068001"/> <meta property="og:site_name" content="NY Daily News"/> <meta property="article:publisher" content="https://www.facebook.com/thenewyorkdailynews"/> <meta name="msvalidate.01" content="02916AAC0DA8B068EFE01D721E03ED7E"/>    <meta name="twitter:card" content="summary"> <meta name="twitter:site" content="@nydailynews"> <meta property="twitter:url" content="http://www.nydailynews.com/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211"/> <meta property="twitter:title" content="Fans stampede outside South African stadium"/> <meta property="twitter:description" content="Thousands of fans stampeded outside the stadium gates of a World Cup warmup game Sunday, five days before the start of soccer's showcase event. 
Several fans could be seen falling under the crush of people, many wearing Nigeria jerseys."/> <meta id="og_title" property="og:title" content="Fans stampede outside South African stadium"/> <meta property="og:type" content="article"/> <meta id="og_url" property="og:url" content="http://www.nydailynews.com/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211"/>   <meta id="og_image" property="og:image" content="http://assets.nydailynews.com/polopoly_fs/1.179213!/img/httpImage/image.jpg_gen/derivatives/landscape_1200/alg-stampede-johannesburg-jpg.jpg"/>   <meta id="og_description" property="og:description" content="Thousands of fans stampeded outside the stadium gates of a World Cup warmup game Sunday, five days before the start of soccer's showcase event. Several fans could be seen falling under the crush of people, many wearing Nigeria jerseys."/> <meta name="description" content="Thousands of fans stampeded outside the stadium gates of a World Cup warmup game Sunday, five days before the start of soccer's showcase event. 
Several fans could be seen falling under the crush of people, many wearing Nigeria jerseys."/>   <meta name="nydn_byline" content="MICHAEL LEWIS"/>   <link rel="stylesheet" type="text/css" href="http://assets.nydailynews.com/nydn/c/nydn.css?r=20120405mk1Bh">  <meta name="parsely-title" content="Fans stampede outside South African stadium"/> <meta name="parsely-link" content="http://www.nydailynews.com/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211"/> <meta name="parsely-type" content="article"/> <meta name="parsely-image-url" content="http://assets.nydailynews.com/polopoly_fs/1.179213!/img/httpImage/image.jpg_gen/derivatives/landscape_1200/alg-stampede-johannesburg-jpg.jpg"/>    <meta name="parsely-pub-date" content="2010-06-06T15:01:04"/>   <meta name="parsely-section" content="Sports"/>   <meta name="parsely-author" content="Michael Lewis"/>       <link rel="stylesheet" type="text/css" href="http://assets.nydailynews.com/nydn/c/article.css?r=20120405mk1Bh">       <meta name="robots" content="NOARCHIVE"/>         <link rel="canonical" href="http://www.nydailynews.com/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211">  <link rel="alternate" media="handheld" href="http://m.nydailynews.com/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211"> <link rel="alternate" media="only screen and (max-width: 640px)" href="http://m.nydailynews.com/sports/soccer-fans-stampede-south-african-stadium-nigeria-north-korea-world-cup-warmup-article-1.179211"/>     <script type="text/javascript" src="http://assets.nydailynews.com/nydn/js/nydn-pack-20140101.js?r=20120405mk1Bh"></script>   <script type="text/javascript" src="http://assets.nydailynews.com/nydn/js/article2014.js?r=20120405mk1Bh"></script>          <!--[if lt IE 9]><script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script><![endif]-->      


                              <link rel="alternate" type="application/rss+xml" title="NYDN Rss" href="http://feeds.nydailynews.com/nydnrss">              <link rel="alternate" type="application/rss+xml" title="Sports Rss" href="http://feeds.feedburner.com/nydnrss/sports">       

..........................

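@LutzHorn Please check now. I had not mentioned this before.

@schelzz15 Do you have any suggestions?

@schelzz15 Two points. 1. I am doing this for a lot of URLs, and this case does not apply to all of them. Is there a way to check for it? Can you provide an example?

@AbhishekBhatia I have added a URL to an example to show how to do this.

Since not all of them are gzipped, I think there should also be a check. Can you be specific about how to do that?

I don't think an HTTP server should send gzip-encoded data unless the client lists it as supported (and I don't think urllib2 does that out of the box).

I am working over a slow internet connection and am facing some timeout issues. Do you think this would solve that as well, or should I explicitly specify a timeout on top of it?

Cunningham, your code gives:
with open(path{}.html'.format(i), 'w') as fid:
^ SyntaxError: invalid syntax

Yes, I missed the opening quote. Fixed. What do you want to happen if it times out?

Try it now. Working on a phone makes it hard to see things properly.

If that happens, do you want to keep trying?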
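import requests


def tries(path, url, i, max_tries=1):
    # Try the download up to max_tries times, stopping at the first success
    for ty in range(1, max_tries + 1):
        try:
            req = requests.get(url)
            page_content = req.content
            # Binary mode, because req.content is bytes
            with open('{}{}.html'.format(path, i), 'wb') as fid:
                fid.write(page_content)
            break
        except requests.exceptions.ConnectionError as e:
            print("Error {} for try {}".format(e, ty))


# urls and path are the same variables as in the loop above
for ind, url in enumerate(urls):
    tries(path, url, ind, 4)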
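The comments also ask about timeouts on a slow connection. The following is one possible sketch of a retry loop that separates the retry logic from the download itself, so it can be exercised without a network. The names fetch_with_retries, fetch, delay and retry_on are hypothetical and not part of the original answer. The default retry_on=(OSError,) is an assumption chosen because Python 3's built-in TimeoutError and ConnectionError are both subclasses of OSError.

```python
import time


def fetch_with_retries(fetch, url, max_tries=4, delay=0.0, retry_on=(OSError,)):
    """Call fetch(url) up to max_tries times and return its result.

    fetch is any callable that downloads url and returns the body.
    Exceptions listed in retry_on trigger another attempt; the last one
    is re-raised if every attempt fails.
    """
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return fetch(url)
        except retry_on as exc:
            last_error = exc
            print("Error {} for try {}".format(exc, attempt))
            if attempt < max_tries:
                time.sleep(delay)  # optional pause before the next attempt
    raise last_error


# Simulated download that times out twice and then succeeds
attempts = []

def flaky_fetch(url):
    attempts.append(url)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return b"<html></html>"

body = fetch_with_retries(flaky_fetch, "http://example.invalid/page")
assert body == b"<html></html>"
assert len(attempts) == 3
```

With requests, fetch could be something like lambda u: requests.get(u, timeout=10).content, with retry_on set to (requests.exceptions.ConnectionError, requests.exceptions.Timeout). The 10-second value is only a placeholder to tune for the connection in question.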