Lambda Python Pool.map and urllib2.urlopen: retry only the failed calls, log only the errors
I have an AWS Lambda function that calls a set of URLs using pool.map. The problem is that if one of the URLs returns anything other than a 200, the Lambda function fails and immediately retries — and it retries the entire Lambda function. I want it to retry only the failed URLs and, if they still fail after a second attempt, call a fixed URL to log the error.

Here is the code as it currently stands (with some details removed), which works only when all of the URLs succeed:
from __future__ import print_function

import urllib2
from multiprocessing.dummy import Pool as ThreadPool
import hashlib
import datetime
import json

print('Loading function')

def lambda_handler(event, context):
    f = urllib2.urlopen("https://example.com/geturls/?action=something");
    data = json.loads(f.read());
    urls = [];
    for d in data:
        urls.append("https://"+d+".example.com/path/to/action");

    # Make the Pool of workers
    pool = ThreadPool(4);

    # Open the urls in their own threads
    # and return the results
    results = pool.map(urllib2.urlopen, urls);

    # close the pool and wait for the work to finish
    pool.close();
    return pool.join();
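The failure mode here is that Pool.map re-raises the first worker exception in the calling thread, so a single bad URL discards the whole batch and fails the handler (and Lambda then retries the entire invocation). A minimal, self-contained sketch of that behavior, using a stand-in fetch function in place of urllib2.urlopen:

```python
from multiprocessing.dummy import Pool as ThreadPool

def fetch(url):
    # Stand-in for urllib2.urlopen: one "URL" always fails
    if url == "bad":
        raise ValueError("non-200 response for %s" % url)
    return "ok: %s" % url

pool = ThreadPool(4)
try:
    # map() re-raises the first worker exception here,
    # discarding the results of the URLs that did succeed
    results = pool.map(fetch, ["a", "bad", "b"])
except ValueError as e:
    results = None
    print("whole batch failed: %s" % e)
pool.close()
pool.join()
```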
I tried reading the documentation, but it seems a bit lacking when it comes to explaining the map function, and in particular its return values. Using the docs, I tried modifying the code to the following:
from __future__ import print_function

import urllib2
from multiprocessing.dummy import Pool as ThreadPool
import hashlib
import datetime
import json

print('Loading function')

def lambda_handler(event, context):
    f = urllib2.urlopen("https://example.com/geturls/?action=something");
    data = json.loads(f.read());
    urls = [];
    for d in data:
        urls.append("https://"+d+".example.com/path/to/action");

    # Make the Pool of workers
    pool = ThreadPool(4);

    # Open the urls in their own threads
    # and return the results
    try:
        results = pool.map(urllib2.urlopen, urls);
    except URLError:
        try: # try once more before logging error
            urllib2.urlopen(URLError.url); # TODO: figure out which URL errored
        except URLError: # log error
            urllib2.urlopen("https://example.com/error/?url="+URLError.url);

    # close the pool and wait for the work to finish
    pool.close();
    return True; # always return true so we never duplicate successful calls
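This attempt cannot work as written: URLError inside the except block is the exception class, not the caught instance (that would need except URLError as e), and even the caught instance carries no record of which element of urls raised it. A stand-in sketch showing that the exception caught around pool.map tells you nothing about the failing URL:

```python
from multiprocessing.dummy import Pool as ThreadPool

def fetch(url):
    # Stand-in for urllib2.urlopen (URLError subclasses IOError)
    if url == "bad":
        raise IOError("non-200 for some url")
    return url

pool = ThreadPool(4)
try:
    pool.map(fetch, ["a", "bad", "b"])
    failed_url = None
except IOError as e:
    # The caught instance is all we get: nothing here says which
    # element of the input list triggered it, and the class itself
    # (as in `except URLError: URLError.url`) has no .url attribute.
    failed_url = getattr(e, "url", None)
pool.close()
pool.join()
print(failed_url)
```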
I am not sure whether that is the right way to do it, or whether I am even using Python's exception notation correctly. Again, my goal is for it to retry only the failed URLs and, if they still fail after a second attempt, call a fixed URL to log the error. Thanks.

The answer was to create my own custom wrapper around the urllib2.urlopen function, because each thread needs its own try/except rather than wrapping the whole batch. That function looks like this:
def my_urlopen(url):
    try:
        return urllib2.urlopen(url)
    except URLError:
        urllib2.urlopen("https://example.com/log_error/?url="+url)
        return None
I placed it above the def lambda_handler function declaration, and then I could replace the entire try/catch inside it from this:
try:
    results = pool.map(urllib2.urlopen, urls);
except URLError:
    try: # try once more before logging error
        urllib2.urlopen(URLError.url);
    except URLError: # log error
        urllib2.urlopen("https://example.com/error/?url="+URLError.url);
to this:
results = pool.map(my_urlopen, urls);
Q.E.D.
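One note: the wrapper above logs on the first failure, while the stated goal was to retry once before logging. A hedged sketch of that variant, with a stand-in fetch function and an in-memory logged list standing in for the real urllib2.urlopen calls:

```python
from multiprocessing.dummy import Pool as ThreadPool

logged = []  # stand-in for the error-logging endpoint

def fetch(url):
    # Stand-in for urllib2.urlopen: this fake host always fails
    if url.startswith("https://bad"):
        raise IOError("non-200 for " + url)
    return "body of " + url

def my_urlopen(url, retries=1):
    # Per-URL wrapper: retry up to `retries` more times, then log and
    # move on, so one bad URL never aborts the whole pool.map() batch.
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except IOError:
            pass
    # all attempts failed: hit the logging URL instead of re-raising
    logged.append("https://example.com/error/?url=" + url)
    return None

pool = ThreadPool(4)
results = pool.map(my_urlopen, ["https://good.example.com", "https://bad.example.com"])
pool.close()
pool.join()
```

Successful URLs keep their responses in results (in input order), failed ones come back as None, and only the failures hit the logging endpoint.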