Ignoring missing files when downloading with Python urllib2

python python-2.7

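Problem: As the title states, I am downloading data via FTP from NOAA based on the year and the day. I have configured my script to go through a range of years and download the data for each day. However, the script gets hung up on days where no file exists. It just keeps retrying the same URL and reporting that the file doesn't exist. Without the time.sleep(5), the script prints to the log like crazy.

Solution: Somehow skip the missing day and move on to the next one. I have explored continue (maybe I am putting it in the wrong place) and creating an empty placeholder file (not elegant, and it still won't move past the missing day). I am at a loss. What have I overlooked?

Here is the script: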

##Working 24km
import urllib2
import time
import os
import os.path

flink = 'ftp://sidads.colorado.edu/DATASETS/NOAA/G02156/24km/{year}/ims{year}{day}_24km_v1.1.asc.gz'
days = [str(d).zfill(3) for d in range(1,365,1)]
years = range(1998,1999)
flinks = [flink.format(year=year,day=day) for year in years for day in days]

from urllib2 import Request, urlopen, URLError

for fname in flinks:
    dl = False
    while dl == False:
        try:
            # req = urllib2.Request(fname)
            req = urllib2.urlopen(fname)
            with open('/Users/username/Desktop/scripts_hpc/scratch/'+fname.split('/')[-1], 'w') as dfile:
                dfile.write(req.read())
            print 'file downloaded'
            dl = True

        except URLError, e:
            #print 'sleeping'
            print e.reason
            #print req.info()
            print 'skipping day: ', fname.split('/')[-1],' was not processed for ims'
            continue
            '''
            if not os.path.isfile(fname):
                f = open('/Users/username/Desktop/scripts_hpc/empty/'+fname.split('/')[-1], 'w')
                print 'day was skipped'
            '''

            time.sleep(5)

        else:
            break
            #everything is fine

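Research: I have looked through other questions. They come close, but none seem to hit the nail on the head. Any help would be greatly appreciated.

Thanks, everyone!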

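I guess things become clear when you get up and walk away for some coffee. Apparently something was getting hung up in my while statement (still not sure why). When I took it out and used pass instead of continue, it behaved correctly.

Here is what it looks like now: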

for fname in flinks:
    try:
        req = urllib2.urlopen(fname)
        with open('/Users/username/Desktop/scripts_hpc/scratch/'+fname.split('/')[-1], 'w') as dfile:
            dfile.write(req.read())
        print 'file downloaded'

    except URLError, e:
        print e.reason
        print 'skipping day: ', fname.split('/')[-1],' was not processed for ims'
        pass
        time.sleep(5)

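In the except block, use pass instead of continue, since continue can only be used inside a loop (for, while).

That way you don't need to handle the missing files: the error is caught and ignored, and Python moves on to the next one.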
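
As a rough illustration of the skip-on-error idea, here is a minimal sketch that tries each URL exactly once and records which files were missing. The function name download_all and its opener parameter are hypothetical, introduced only so the loop can be exercised without a network connection. The import fallback and the 'wb' mode are assumptions on my part (binary mode seems safer for .gz payloads), not something taken from the original script.

```python
import os

try:  # Python 2, as used in the question
    from urllib2 import urlopen, URLError
except ImportError:  # Python 3 equivalent (assumption, not in the original)
    from urllib.request import urlopen
    from urllib.error import URLError


def download_all(urls, dest_dir, opener=urlopen):
    """Try each URL once; return (downloaded, skipped) lists of file names.

    `opener` is a hypothetical hook so a fake can replace urlopen in tests.
    """
    downloaded, skipped = [], []
    for url in urls:
        name = url.split('/')[-1]
        try:
            data = opener(url).read()
        except URLError:
            # Missing file (or another URL error): note it and move on.
            skipped.append(name)
            continue  # targets the for loop, so the next URL is tried
        # Binary mode, because the .gz payload is not text.
        with open(os.path.join(dest_dir, name), 'wb') as dfile:
            dfile.write(data)
        downloaded.append(name)
    return downloaded, skipped
```

With the question's variables this could be called as download_all(flinks, '/Users/username/Desktop/scripts_hpc/scratch/'). Returning the skipped names makes it easy to review the gaps afterwards instead of reading them out of the log.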

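In your current example you should still see the "crazy printing", because the continue comes before the sleep, so time.sleep(5) is never reached. If you want to skip missing files entirely, the whole inner while loop is redundant. As for the reason: the continue statement continues the enclosing while loop, not the for loop, and since dl is still False, the same file is requested again.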
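
The difference between the two loop shapes can be shown without touching the network. The sketch below is a toy stand-in, not the original script: the function names, the missing set, and the max_attempts cap are all hypothetical, and the cap exists only so the demonstration terminates (the script in the question has no such limit, which is why it never moves on).

```python
def attempts_with_while(items, missing, max_attempts=3):
    """Mimic the question's for + while shape with a hypothetical retry cap."""
    log = []
    for item in items:
        done = False
        attempts = 0
        while not done:
            attempts += 1
            if attempts > max_attempts:
                break  # safety cap for the demo; not in the original script
            if item in missing:
                log.append('retry ' + item)
                continue  # jumps back to `while`: same item again
            log.append('ok ' + item)
            done = True
    return log


def attempts_without_while(items, missing):
    """Same data with only the for loop: continue moves to the next item."""
    log = []
    for item in items:
        if item in missing:
            log.append('skip ' + item)
            continue  # jumps to the next iteration of the for loop
        log.append('ok ' + item)
    return log
```

In the first function, the missing item is retried until the cap is hit. In the second, it is logged once and the loop moves on to the next item.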

pass does nothing in this case; you might as well remove it.

pass basically means "no operation" in Python.