Memory leak with a queue reader using tf.py_func
I am trying to write a queue reader that iterates over a large file and runs a python function on each line before passing it on to the actual operations. I use a string_input_producer to read a single .tsv file, then create a queue with tf.TextLineReader and augment each line with tf.py_func. Doing this, I noticed a memory leak that only shows up when tf.py_func is called (yes, even as a noop).
Running the following code produces this output:
$ python test_memory.py 2> /dev/null
run WITHOUT tf.py_func
00001/50000, 1.4260% mem
05001/50000, 1.4512% mem
10001/50000, 1.4512% mem
15001/50000, 1.4512% mem
20001/50000, 1.4512% mem
25001/50000, 1.4516% mem
30001/50000, 1.4516% mem
35001/50000, 1.4516% mem
40001/50000, 1.4516% mem
45001/50000, 1.4516% mem
50000/50000, 1.4516% mem
===========================
run WITH tf.py_func
00001/50000, 1.4975% mem
05001/50000, 1.5051% mem
10001/50000, 1.5066% mem
15001/50000, 1.5081% mem
20001/50000, 1.5110% mem
25001/50000, 1.5137% mem
30001/50000, 1.5148% mem
35001/50000, 1.5165% mem
40001/50000, 1.5195% mem
45001/50000, 1.5210% mem
50000/50000, 1.5235% mem
===========================
As you can see, running the code without tf.py_func keeps memory usage stable, while running it with the python noop makes memory grow continuously. The effect is far more pronounced on files with larger lines.
test_memory.py:
import os
import sys
import psutil
import tensorflow as tf

def py_funner(x, do_py=True):
    '''
    this function returns the exact input.
    if do_py==True, it passes the data through a python noop using tf.py_func
    '''
    if do_py:
        def py_func(y):
            # this is just another noop.
            return y
        # py_func wraps a python function as a tensorflow op.
        return tf.py_func(py_func, [x], [tf.string], stateful=False)[0]
    else:
        return x

def get_data(do_py=True):
    # take the code as input. the effect is way more pronounced on larger files,
    # e.g., a tsv that encodes image data in base64, as for ms-celeb-1m
    in_str = os.__file__
    # produce a queue that reads the one file row by row.
    input_queue = tf.train.string_input_producer([in_str])
    reader = tf.TextLineReader()
    ind, row = reader.read(input_queue)
    # call the wrapper to either include tf.py_func or not.
    return py_funner(row, do_py=do_py)

def main():
    # get the current process to monitor memory usage
    process = psutil.Process(os.getpid())
    # execute the same code both with a tf.py_func noop and without it
    for tt in [False, True]:
        print('run WITH%s tf.py_func' % ('' if tt else 'OUT'))
        # generate the data queue
        data = get_data(do_py=tt)
        # start the session and the queue coordinator
        sess = tf.Session()
        coord = tf.train.Coordinator()
        queue_threads = tf.train.start_queue_runners(sess, coord=coord)
        # read a lot of the file
        max_iter = 50000
        for i in range(max_iter):
            run_ops = [data]
            d = sess.run(run_ops)
            mem = process.memory_percent()
            print('\r%05d/%d, %.4f%% mem' % (i + 1, max_iter, mem), end='')
            sys.stdout.flush()
            if i % 5000 == 0:
                print()
        print('\n===========================')

if __name__ == '__main__':
    main()
I would appreciate any suggestions and ideas on how to debug this further. Is there maybe a way to see whether the python function retains some kind of storage?
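One generic way to see whether the python side is holding on to objects is to diff allocation snapshots with Python's standard tracemalloc module across many iterations. This is a minimal sketch, not part of the original code: the run_step callable is a placeholder for whatever you run per iteration (e.g. a lambda around sess.run).

    import tracemalloc

    def snapshot_growth(run_step, iters=1000, top=10):
        '''Run `run_step` many times and report the allocation sites that grew.'''
        tracemalloc.start()
        before = tracemalloc.take_snapshot()
        for _ in range(iters):
            run_step()
        after = tracemalloc.take_snapshot()
        tracemalloc.stop()
        # entries with a positive size_diff point at code that retained memory
        return after.compare_to(before, 'lineno')[:top]

Calling snapshot_growth(lambda: sess.run(run_ops)) would list the source lines whose live allocations grew the most over the loop; a line inside the py_func machinery showing steady growth would confirm that the wrapper retains storage.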
Thanks!

Queue readers are deprecated. Can you reproduce this with tf.data pipelines?
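For reference, a tf.data version of the same pipeline might look like the sketch below. It assumes a TensorFlow version where tf.data.TextLineDataset and tf.py_function are available and eager execution is enabled; per_line is just an illustrative noop standing in for the real per-line python function.

    import tensorflow as tf

    def per_line(line):
        # stand-in for the real per-line python function (here: a noop)
        return line

    def make_dataset(path):
        # tf.data replacement for string_input_producer + TextLineReader:
        # stream the file line by line and run a python function on each line.
        ds = tf.data.TextLineDataset(path)
        ds = ds.map(lambda line: tf.py_function(per_line, [line], tf.string))
        return ds

Iterating make_dataset(os.__file__) with the same psutil monitoring as in test_memory.py would show whether the growth also occurs without the queue-runner machinery.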