Python: Tornado memory leak on dropped connections


python asynchronous memory-leaks


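I have a setup where Tornado is used as a kind of pass-through for workers. Tornado receives a request, sends it to N workers, aggregates the results and sends them back to the client. This works fine, except when a timeout occurs for some reason; then I get a memory leak.

I have a setup similar to this pseudocode: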

import json

import tornado.httpclient
import tornado.ioloop
import tornado.web

workers = ["http://worker1.example.com:1234/",
           "http://worker2.example.com:1234/",
           "http://worker3.example.com:1234/"]  # ... and so on

class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        responses = []

        def __callback(response):
            # Called once per worker, for successes and errors alike.
            responses.append(response)
            if len(responses) == len(workers):
                self._finish_req(responses)

        for url in workers:
            async_client = tornado.httpclient.AsyncHTTPClient()
            # Forward the incoming body (assumed; the pseudocode left `body` undefined).
            request = tornado.httpclient.HTTPRequest(
                url, method=self.request.method, body=self.request.body)
            async_client.fetch(request, __callback)

    def _finish_req(self, responses):
        good_responses = [r for r in responses if not r.error]
        if not good_responses:
            raise tornado.web.HTTPError(500, "\n".join(str(r.error) for r in responses))
        # aggregate_results() is defined elsewhere in the application.
        results = aggregate_results(good_responses)
        self.set_header("Content-Type", "application/json")
        self.write(json.dumps(results))
        self.finish()

application = tornado.web.Application([
    (r"/", MyHandler),
])

if __name__ == "__main__":
    # ... some locking code
    application.listen(8888)  # placeholder port; the original call omitted it
    tornado.ioloop.IOLoop.instance().start()

What am I doing wrong? Where does the memory leak come from?

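I don't know the source of the problem, and it seems the gc should be able to take care of it, but there are two things you can try.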

The first is to simplify some of the references (it looks like there are still references to responses when the RequestHandler completes):
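The code that illustrated this suggestion is not reproduced here. What follows is only a minimal sketch of the idea, not the answerer's original code: keep the collected responses on the object instead of in a closure, use a bound method as the callback, and drop the references once the request finishes or the client disconnects. `FanOutRequest`, `on_worker_response` and `finish` are hypothetical names, and the class is a plain-Python stand-in for a `RequestHandler` so that it runs without Tornado.

```python
class FanOutRequest:
    """Plain-Python stand-in for a handler that fans a request out to workers."""

    def __init__(self, worker_count, finish):
        self.expected = worker_count
        self.finish = finish      # called once with the list of responses
        self.responses = []
        self.done = False

    def on_worker_response(self, response):
        # Bound-method callback: no closure holds `responses` separately.
        if self.done:
            return  # late response after finish or disconnect: ignore it
        self.responses.append(response)
        if len(self.responses) == self.expected:
            self._complete(deliver=True)

    def on_connection_close(self):
        # Client went away: nothing to send, just release references.
        self._complete(deliver=False)

    def _complete(self, deliver):
        if self.done:
            return
        self.done = True
        responses, self.responses = self.responses, []
        finish, self.finish = self.finish, None
        if deliver:
            finish(responses)


delivered = []
req = FanOutRequest(worker_count=2, finish=delivered.append)
req.on_worker_response("a")
req.on_worker_response("b")

dropped = FanOutRequest(worker_count=2, finish=delivered.append)
dropped.on_worker_response("a")
dropped.on_connection_close()
dropped.on_worker_response("b")   # arrives after the disconnect, ignored
```

In a real handler the same shape would presumably map onto `post`, a response callback method and `on_connection_close`; whether it actually removes the leak depends on what else is holding the references.
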
If that doesn't work, you can always invoke garbage collection manually:
import gc

class MyHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def post(self):
        ...

    def _finish_req(self):
        ...

    def on_connection_close(self):
        # Tornado calls this when the client closes the connection early.
        gc.collect()
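To see what a manual `gc.collect()` can and cannot do, the standalone sketch below builds a reference cycle and checks that it is only freed once the collector runs. `Node` and `make_cycle` are hypothetical names used purely for illustration. If memory keeps growing even after a collection, the objects are probably still reachable from somewhere, and collecting more often will not help.

```python
import gc
import weakref


class Node:
    """Tiny object used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # a <-> b keep each other alive
    return weakref.ref(a)        # a weak reference does not keep `a` alive


gc.disable()                     # no automatic collection during the demo
probe = make_cycle()
alive_before = probe() is not None   # unreachable, but not freed yet
found = gc.collect()                 # number of unreachable objects found
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after, found >= 2)
```
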

The code looks fine. The leak is probably inside Tornado.
I just stumbled over this line:
async_client = tornado.httpclient.AsyncHTTPClient()

Are you aware of the instantiation magic in this constructor?
From the docs:
"""
The constructor for this class is magic in several respects:  It actually
creates an instance of an implementation-specific subclass, and instances
are reused as a kind of pseudo-singleton (one per IOLoop).  The keyword
argument force_instance=True can be used to suppress this singleton
behavior.  Constructor arguments other than io_loop and force_instance
are deprecated.  The implementation subclass as well as arguments to
its constructor can be set with the static method configure()
"""
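To make the quoted behaviour concrete, here is a toy illustration of a per-loop pseudo-singleton with a `force_instance` escape hatch. It is not Tornado's implementation; `PerLoopClient` and `FakeLoop` are hypothetical names, and the sketch only mirrors what the docstring describes.

```python
import weakref


class FakeLoop:
    """Stands in for an IOLoop; only its identity matters here."""


class PerLoopClient:
    """Toy class: one shared instance per loop unless force_instance=True."""

    _instances = weakref.WeakKeyDictionary()   # loop -> shared client

    def __new__(cls, loop, force_instance=False):
        if force_instance:
            return super().__new__(cls)        # private, unshared instance
        client = cls._instances.get(loop)
        if client is None:
            client = super().__new__(cls)
            cls._instances[loop] = client
        return client

    def __init__(self, loop, force_instance=False):
        # __init__ runs on every call, even when __new__ returned the shared
        # instance, so it must not reset any state.
        pass


loop = FakeLoop()
first = PerLoopClient(loop)
second = PerLoopClient(loop)
private = PerLoopClient(loop, force_instance=True)
other_loop_client = PerLoopClient(FakeLoop())

print(first is second, private is first)
```

Under this pattern, constructing the client inside the `for` loop hands back the same object on every iteration, so the loop itself creates no extra clients.
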

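So actually, you don't need to do this inside the loop. (On the other hand, it shouldn't do any harm.) But which implementation are you using, CurlAsyncHTTPClient or SimpleAsyncHTTPClient?
If it is SimpleAsyncHTTPClient, be aware of this comment in the code: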
"""
This class has not been tested extensively in production and
should be considered somewhat experimental as of the release of
tornado 1.2. 
"""

You can try switching to CurlAsyncHTTPClient. Or follow Nikolay Fominyh's suggestion and trace the calls to __callback().
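if len(responses) == len(workers): I don't like this. Are you sure the application always gets here? Try logging both the attempted and the successful batch requests.

@Nikolay: OK, well, Tornado uses the callback for both success and error, so I'm pretty sure there will always be that many responses no matter how many workers fail. What I'm not sure about is what happens when the client cancels the request.

Also, if you have more than 10 workers and they all die on timeout, there is a period during which Tornado cannot create new connections, and I don't know how it behaves then. Try playing with the max_clients parameter.

In this case I would use a Queue, since it is thread-safe and notifies you when all jobs are done.

How do you know there is a memory leak? Is your server running out of memory, or did profiling alert you to the problem?

From the Python gc documentation: gc.garbage is a list of objects which the collector found to be unreachable but could not be freed. I noticed that this list is empty at startup but grows with every request.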
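The Queue suggestion in the comments can be sketched as follows, assuming the commenter meant the standard library `queue.Queue` with worker threads. `run_jobs` is a hypothetical helper, and the jobs are plain callables standing in for the worker requests.

```python
import queue
import threading


def run_jobs(jobs, worker_count=3):
    """Run callables on worker threads; return their results once all are done."""
    pending = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = pending.get()
            if job is None:           # sentinel: no more work for this thread
                pending.task_done()
                return
            try:
                outcome = job()
            except Exception as exc:  # record failures instead of losing them
                outcome = exc
            with results_lock:
                results.append(outcome)
            pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)
    pending.join()                    # blocks until every put() has a task_done()
    for t in threads:
        t.join()
    return results


results = run_jobs([lambda: 1, lambda: 2, lambda: 1 / 0])
```

Note that `Queue.join()` blocks the calling thread, so it would stall the IOLoop if called from a Tornado handler; the sketch shows only the "notify when all jobs are done" part of the suggestion.
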