Lambda Python Pool.map and urllib2.urlopen: retry only the failed processes, log only the errors


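I have an AWS Lambda function that uses pool.map to call a set of URLs. The problem is that if one of the URLs returns anything other than 200, the Lambda function fails and is retried immediately, and the retry reruns the entire Lambda function. I would like it to retry only the failed URLs and, if a URL still fails after the second attempt, call a fixed URL to log the error.

Here is the code as it currently stands (with some details removed); it works only when all of the URLs succeed: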

from __future__ import print_function
import urllib2 
from multiprocessing.dummy import Pool as ThreadPool 

import hashlib
import datetime
import json

print('Loading function')

def lambda_handler(event, context):

  f = urllib2.urlopen("https://example.com/geturls/?action=something");
  data = json.loads(f.read());

  urls = [];
  for d in data:
      urls.append("https://"+d+".example.com/path/to/action");

  # Make the Pool of workers
  pool = ThreadPool(4);

  # Open the urls in their own threads
  # and return the results
  results = pool.map(urllib2.urlopen, urls);

  #close the pool and wait for the work to finish 
  pool.close();
  return pool.join();

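I tried reading the documentation, but it seems a bit thin on the map function, especially on what it returns.

Working from the documentation, I tried changing the code to the following: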

from __future__ import print_function
import urllib2 
from multiprocessing.dummy import Pool as ThreadPool 

import hashlib
import datetime
import json

print('Loading function')

def lambda_handler(event, context):

  f = urllib2.urlopen("https://example.com/geturls/?action=something");
  data = json.loads(f.read());

  urls = [];
  for d in data:
      urls.append("https://"+d+".example.com/path/to/action");

  # Make the Pool of workers
  pool = ThreadPool(4);

  # Open the urls in their own threads
  # and return the results
  try:
     results = pool.map(urllib2.urlopen, urls);
  except URLError:
     try:                              # try once more before logging error
        urllib2.urlopen(URLError.url); # TODO: figure out which URL errored
     except URLError:                  # log error
        urllib2.urlopen("https://example.com/error/?url="+URLError.url);

  #close the pool and wait for the work to finish 
  pool.close();
  return true; # always return true so we never duplicate successful calls

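I am not sure whether this is the right approach, or whether I am using Python's exception syntax correctly. Again, my goal is to retry only the failed URLs and, if a URL still fails after the second attempt, call a fixed URL to log the error.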
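To make the return-value question concrete, here is a minimal sketch of how map on a thread pool behaves when one worker raises. It uses Python 3 syntax, and fake_fetch and fetch_safely are hypothetical stand-ins (no network calls), not part of the original code. As far as I understand it, map returns a list in the same order as its input, and if a worker raises, map re-raises that exception in the caller and the list is never returned, which is why a try/except around pool.map cannot tell which URL failed.

```python
from multiprocessing.dummy import Pool as ThreadPool


def fake_fetch(url):
    # Hypothetical stand-in for urllib2.urlopen: fails for URLs containing "bad".
    if "bad" in url:
        raise IOError("cannot open " + url)
    return "ok:" + url


def fetch_safely(url):
    # Catch the error inside the worker so one failure cannot abort the batch.
    try:
        return (url, fake_fetch(url), None)
    except IOError as exc:
        return (url, None, exc)


urls = [
    "https://a.example.com",
    "https://bad.example.com",
    "https://c.example.com",
]

pool = ThreadPool(4)

try:
    # The worker's exception is re-raised here; no result list comes back.
    pool.map(fake_fetch, urls)
    outcome = "no error"
except IOError as exc:
    outcome = "map raised: " + str(exc)

# With the per-item wrapper, map returns one tuple per URL, in input order.
results = pool.map(fetch_safely, urls)
pool.close()
pool.join()

failed = [url for url, _, err in results if err is not None]
```

With the wrapper in place, the list named failed holds exactly the URLs that raised, so they can be retried or logged individually.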

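The answer was to write my own wrapper around the urllib2.urlopen function, because each thread needs its own try/except rather than one around the whole process. The function looks like this: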

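def my_urlopen(url):
    try:
        return urllib2.urlopen(url)
    except urllib2.URLError:  # URLError lives in urllib2; HTTPError is a subclass
        # report the failed URL to the logging endpoint instead of raising
        urllib2.urlopen("https://example.com/log_error/?url="+url)
        return None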

I put it above the def lambda_handler function declaration, and could then replace the whole try/except inside the handler, going from this:

try:
   results = pool.map(urllib2.urlopen, urls);
except URLError:
   try:                              # try once more before logging error
      urllib2.urlopen(URLError.url);
   except URLError:                  # log error
      urllib2.urlopen("https://example.com/error/?url="+URLError.url);

to this:

results = pool.map(my_urlopen, urls);

Q.E.D
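The wrapper above logs on the first failure. For the original goal (retry once, then log), a variant might look like the sketch below. It is written for Python 3, where urllib2 is split into urllib.request and urllib.error; on Python 2 the equivalents are urllib2.urlopen, urllib2.URLError and urllib.quote. The names open_with_retry and open_all and the opener and attempts parameters are my own hypothetical additions, and LOG_URL is assumed to be the error endpoint from the question.

```python
from multiprocessing.dummy import Pool as ThreadPool
from urllib.error import URLError
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the error endpoint from the question; adjust to the real one.
LOG_URL = "https://example.com/error/?url="


def open_with_retry(url, opener=urlopen, attempts=2):
    """Try url up to `attempts` times; report it to LOG_URL if every try fails."""
    for _ in range(attempts):
        try:
            return opener(url)
        except URLError:
            # HTTPError subclasses URLError, so statuses like 404 or 500 land here too.
            continue
    try:
        # Percent-encode the failed URL so it survives as a query parameter.
        opener(LOG_URL + quote(url, safe=""))
    except URLError:
        pass  # logging is best effort; never let it fail the whole batch
    return None


def open_all(urls, opener=urlopen, workers=4):
    """Open every URL in a thread pool; failed URLs yield None in the result list."""
    pool = ThreadPool(workers)
    try:
        return pool.map(lambda u: open_with_retry(u, opener), urls)
    finally:
        pool.close()
        pool.join()
```

Inside lambda_handler this would be called as open_all(urls). Because no exception escapes the workers, the handler can return normally, which should keep Lambda from rerunning the whole function for a single bad URL.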