How to elegantly log multiple very similar events in Python?


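Using Python's logging module, is there a way to collect multiple events into one log entry? An ideal solution would be an extension of the logging module, or a custom formatter/filter for it, so that collecting log events of the same kind happens in the background and nothing needs to be added in the code body (e.g. at every call of a logging function).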

Here is an example that generates a large number of identical or very similar logging events:

import logging

for i in range(99999): 
    try:
        asdf[i]   # not defined!
    except NameError:
        logging.exception('foo') # generates large number of logging events
    else: pass

# ... more code with more logging ...

for i in range(88888): logging.info('more of the same %d' % i)

# ... and so on ...

So we hit the same exception 99999 times and log it every time. It would be nice if the log just said something like:

ERROR:root:foo (occured 99999 times)
Traceback (most recent call last):
  File "./exceptionlogging.py", line 10, in <module>
    asdf[i]   # not defined!
NameError: name 'asdf' is not defined

INFO:root:foo more of the same (occured 88888 times with various values)

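Create a counter and log the exception only when count=1, then keep incrementing it and write the total out in a finally block (so that it is logged no matter how badly the application crashes and burns). This can of course be a problem if the same exception is raised for different reasons, but you can always search for the line number to verify whether it is the same issue or a similar one. A minimal example: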

import logging

name_error_exception_count = 0

try:
    for i in range(99999):
        try:
            asdf[i]  # not defined!
        except NameError:
            name_error_exception_count += 1
            if name_error_exception_count == 1:
                logging.exception('foo')  # log the traceback only on the first occurrence
        else:
            pass
except Exception:
    pass  # this is just to get the finally block, handle exceptions here too, maybe
finally:
    if name_error_exception_count > 0:
        # logging.error rather than logging.exception: the latter is meant to be
        # called from inside an exception handler
        logging.error('NameError exception occurred {} times.'.format(name_error_exception_count))

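You should probably be writing a message aggregate/statistics class rather than trying to hook into the logging system's singletons, but I guess you may have an existing code base that uses logging.
I'd also suggest that you instantiate your loggers rather than always using the default root logger. The Python Logging Cookbook has extensive explanations and examples.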
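To illustrate the point about instantiating loggers, here is a small sketch. The logger names "myapp" and "myapp.db" are made up for the example; the only thing it relies on is that `logging.getLogger(name)` returns a named logger that sits in a dot-separated hierarchy below the root.

```python
import logging

# "myapp" and "myapp.db" are hypothetical names chosen for illustration.
app_log = logging.getLogger("myapp")
db_log = logging.getLogger("myapp.db")

# Named loggers form a hierarchy: "myapp.db" is a child of "myapp".
print(db_log.parent is app_log)

# Configuring a named logger leaves the root logger (and other libraries) alone.
app_log.setLevel(logging.DEBUG)

# The child has no level of its own, so it inherits the parent's level.
print(db_log.getEffectiveLevel() == logging.DEBUG)
```

Any filter or handler attached to `app_log` then only affects that part of the application, which makes an aggregation hook much easier to scope than one installed on the root logger.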
The following class should do what you are asking for:

import logging
import atexit
import pprint

class Aggregator(object):
    logs = {}

    @classmethod
    def _aggregate(cls, record):
        id = '{0[levelname]}:{0[name]}:{0[msg]}'.format(record.__dict__)
        if id not in cls.logs:  # first occurrence
            cls.logs[id] = [1, record]
        else:  # subsequent occurrence
            cls.logs[id][0] += 1

    @classmethod
    def _output(cls):
        for count, record in cls.logs.values():
            record.__dict__['msg'] += ' (occured {} times)'.format(count)
            logging.getLogger(record.__dict__['name']).handle(record)

    @staticmethod
    def filter(record):
        # pprint.pprint(record)
        Aggregator._aggregate(record)
        return False  # swallow the record; it is emitted once at exit

    @staticmethod
    def exit():
        # Detach the filter first, otherwise handle() would run the summary
        # records through the filter again and drop them.
        logging.getLogger().removeFilter(Aggregator)
        Aggregator._output()

logging.getLogger().addFilter(Aggregator)
atexit.register(Aggregator.exit)

for i in range(99999):
    try:
        asdf[i]  # not defined!
    except NameError:
        logging.exception('foo')  # generates large number of logging events
    else:
        pass

# ... more code with more logging ...

for i in range(88888):
    logging.error('more of the same')

# ... and so on ...
Note that you don't get any logs until the program exits.
The result of running it is:
ERROR:root:foo (occured 99999 times)
Traceback (most recent call last):
  File "C:\work\VEMS\python\logcount.py", line 38, in <module>
    asdf[i]  # not defined!
NameError: name 'asdf' is not defined
ERROR:root:more of the same (occured 88888 times)
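You can subclass the logger class and override the exception method so that error types are kept in a cache until they reach a certain count, and only then emitted to the log.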

import logging
from collections import defaultdict

MAX_COUNT = 99999

class MyLogger(logging.getLoggerClass()):

    def __init__(self, name):
        super(MyLogger, self).__init__(name)
        self.cache = defaultdict(int)  # error type name -> occurrences so far

    def exception(self, msg, *args, **kwargs):
        err = msg.__class__.__name__
        self.cache[err] += 1
        # >= so that the reported count matches the number of occurrences
        if self.cache[err] >= MAX_COUNT:
            new_msg = "{err} occurred {count} times.\n{msg}"
            new_msg = new_msg.format(err=err, count=MAX_COUNT, msg=msg)
            self.log(logging.ERROR, new_msg, *args, **kwargs)
            self.cache[err] = 0  # reset; storing None would break the next += 1

log = MyLogger('main')

try:
    raise TypeError("Useful error message")
except TypeError as err:
    log.exception(err)
Please note that this is not copy-paste code.
You need to add your handlers (I recommend a formatter, too) yourself.
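For the handler and formatter part, something along these lines might do. It is only a sketch: the logger name "aggregation_demo" and the format string are assumptions, and an `io.StringIO` stands in for `sys.stderr` or a file so the result can be inspected. The same `addHandler` / `setFormatter` calls apply to an instance of a `Logger` subclass.

```python
import io
import logging

# Hypothetical logger name; a Logger subclass instance is configured the same way.
log = logging.getLogger("aggregation_demo")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the demo output away from any root handlers

# StringIO stands in for sys.stderr or a log file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
log.addHandler(handler)

log.error("TypeError occurred %d times.", 3)
print(stream.getvalue(), end="")
```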

Have fun.
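Your question hides a subliminal assumption about how "very similar" is defined. Log records can be const-only (all instances are strictly identical) or a mix of consts and variables (no consts at all also counts as a mix).
An aggregator for const-only log records is a piece of cake. You just need to decide whether processes/threads should fork your aggregation or not. For log records that contain both consts and variables, you need to decide whether to split the aggregation based on the variables in the record.
A dictionary-style counter (from collections import Counter) can serve as the cache and will count your instances in O(1), but you may need a higher-level structure if you also want to write the variables down. Additionally, you have to handle writing the cache to a file yourself: either every X seconds (binning) or once the program has exited (risky: if something gets stuck you may lose all the in-memory data).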
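The const/variable distinction can be made concrete with a small sketch. It assumes the calling code passes the variable as a logging argument (`log.info("... %d", i)`) rather than formatting the string up front, so that `record.msg` still holds the constant template while `record.getMessage()` holds the formatted text. `CountingFilter` and the logger name are hypothetical names used only for this illustration.

```python
import logging
from collections import Counter

class CountingFilter(logging.Filter):
    """Hypothetical filter: counts records, lets every record through."""

    def __init__(self):
        super().__init__()
        self.by_template = Counter()  # key uses the unformatted message (consts only)
        self.by_message = Counter()   # key uses the formatted message (consts + variables)

    def filter(self, record):
        self.by_template[(record.levelname, record.msg)] += 1
        self.by_message[(record.levelname, record.getMessage())] += 1
        return True

log = logging.getLogger("counter_demo")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(logging.NullHandler())
counts = CountingFilter()
log.addFilter(counts)

for i in range(5):
    log.info("more of the same %d", i)  # lazy formatting keeps the template intact

print(len(counts.by_template))  # number of groups when keyed on the template
print(len(counts.by_message))   # number of groups when keyed on the full message
```

Keying on the template gives one group for all five records; keying on the formatted message gives one group per distinct value. Note that a call like `logging.info('more of the same %d' % i)` formats the string before logging sees it, so only the second kind of key is available there.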
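A framework for aggregation would look something like this (tested on Python v3.4):
If you want to add more to it, this is what a Python log record looks like: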

{'args': ['()'],
 'created': ['1413747902.18'],
 'exc_info': ['None'],
 'exc_text': ['None'],
 'filename': ['push_socket_log.py'],
 'funcName': ['<module>'],
 'levelname': ['DEBUG'],
 'levelno': ['10'],
 'lineno': ['17'],
 'module': ['push_socket_log'],
 'msecs': ['181.387901306'],
 'msg': ['Test message.'],
 'name': ['__main__'],
 'pathname': ['./push_socket_log.py'],
 'process': ['65486'],
 'processName': ['MainProcess'],
 'relativeCreated': ['12.6709938049'],
 'thread': ['140735262810896'],
 'threadName': ['MainThread']}

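One more thing to consider: most features you run depend on a flow of several consecutive commands (which ideally report log records accordingly). For example, a client-server exchange typically involves receiving a request, processing it, reading some data from the DB (which requires a connection and some read commands), some kind of parsing/processing, constructing the response packet and reporting the response code.
This highlights one of the main disadvantages of an aggregation approach: by aggregating log records you lose track of the time and order of the actions that took place. If all you have is the aggregate, it is very hard to figure out which request was incorrectly structured. My advice in this case is to keep both the raw data and the aggregation (using two file handlers or something similar), so that you can investigate at the macro level (aggregation) and at the micro level (normal logging).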
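Keeping both views could look roughly like the sketch below. It is an illustration under assumptions, not a tested production setup: `SummaryHandler`, the logger name and the message texts are invented for the example, and two `io.StringIO` objects stand in for the raw and aggregate log files.

```python
import io
import logging
from collections import Counter

class SummaryHandler(logging.Handler):
    """Hypothetical handler that only counts records per (level, template)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        self.counts[(record.levelname, str(record.msg))] += 1

    def write_summary(self, stream):
        for (level, template), n in self.counts.items():
            stream.write("{}:{} (occurred {} times)\n".format(level, template, n))

log = logging.getLogger("dual_demo")
log.setLevel(logging.DEBUG)
log.propagate = False

raw_stream = io.StringIO()      # stands in for the raw log file
summary_stream = io.StringIO()  # stands in for the aggregate file

raw = logging.StreamHandler(raw_stream)
raw.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
summary = SummaryHandler()
log.addHandler(raw)      # micro level: every record, in order
log.addHandler(summary)  # macro level: counts only

for request_id in range(3):
    log.info("request %d received", request_id)
    log.error("request %d malformed", request_id)

summary.write_summary(summary_stream)
print(raw_stream.getvalue(), end="")
print(summary_stream.getvalue(), end="")
```

The raw stream keeps the order of events per request, while the summary collapses them into one line per message template.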
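However, you are still left with the responsibility of noticing that something has gone wrong and then manually investigating what caused it. When developing on your PC this is an easy enough task, but once your code is deployed on several production servers it becomes cumbersome and wastes a lot of your time. Accordingly, several companies build products specifically for log management. Most of them aggregate similar log records together, while others use machine learning algorithms to aggregate automatically and learn your software's behavior. Outsourcing your log handling can then let you focus on your product instead of on your bugs.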

Disclaimer: I work for one such solution.
I guess you would need to add some hooks (perhaps a custom logging.Formatter sub-c)
if __name__ == "__main__":
    import random
    import logging

    logger = logging.getLogger()
    # LogAggregatorHandler is the aggregating handler class this driver expects;
    # its definition is not included in this excerpt.
    handler = LogAggregatorHandler()
    logger.addHandler(handler)
    logger.addHandler(logging.StreamHandler())
    logger.setLevel(logging.DEBUG)

    logger.info("entering logging loop")

    for i in range(25):
        # Randomly choose log severity:
        severity = random.choice([logging.DEBUG, logging.INFO, logging.WARNING,
                                  logging.ERROR, logging.CRITICAL])
        logger.log(severity, "test message number %s", i)

    logger.info("end of test code")
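The driver above expects a `LogAggregatorHandler` class. The following is one possible shape for such a handler, written as a sketch under assumptions and not as the class the answer originally used: it groups records by level and unformatted message, keeps the first record of each group plus a count, and forwards one summary record per group to a `target` handler when `flush()` is called (which `logging.shutdown()` also does at interpreter exit). The `target` parameter and the "(occurred N times)" suffix are inventions for this example.

```python
import io
import logging

class LogAggregatorHandler(logging.Handler):
    """Sketch: keep the first record per (level, template) plus a count."""

    def __init__(self, target=None):
        super().__init__()
        # Assumed design: summaries go to another handler (stderr by default).
        self.target = logging.StreamHandler() if target is None else target
        self.seen = {}  # (levelno, template) -> [count, first_record]

    def emit(self, record):
        key = (record.levelno, str(record.msg))
        if key in self.seen:
            self.seen[key][0] += 1
        else:
            self.seen[key] = [1, record]

    def flush(self):
        for count, record in self.seen.values():
            # Copy the first record so the original is left untouched.
            summary = logging.makeLogRecord(record.__dict__)
            summary.msg = "{} (occurred {} times)".format(record.msg, count)
            summary.args = None  # the template is shown as-is, not re-formatted
            self.target.handle(summary)
        self.seen.clear()

stream = io.StringIO()  # stands in for a file or stderr
target = logging.StreamHandler(stream)
target.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
handler = LogAggregatorHandler(target)

log = logging.getLogger("aggregator_demo")
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(handler)

for i in range(25):
    log.warning("test message number %s", i)
log.error("something else")

handler.flush()
print(stream.getvalue(), end="")
```

Because the grouping key uses the template rather than the formatted text, all 25 warnings collapse into one line; calling `flush()` on a timer instead of only at the end would give the time-binned variant mentioned earlier.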
