Python leaks TarInfo objects

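I have a Python utility that iterates over a tar.xz file and processes each individual file in it. The archive is 15MB compressed and contains 740MB of uncompressed data.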

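On one particular server with very limited memory, the program crashes because it runs out of memory. I used objgraph to see which objects were being created. It turns out that TarInfo instances are not being released. The main loop is similar to this: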

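import objgraph
import tarfile

count = 0
with tarfile.open(...) as tar:
    while True:
        member = tar.next()
        if member is None:  # end of archive
            break
        stream = tar.extractfile(member)
        process_stream(stream)
        count += 1
        if count % 1000 == 0:
            objgraph.show_growth(limit=10)

The output is very consistent: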

TarInfo     2040     +1000
TarInfo     3040     +1000
TarInfo     4040     +1000
TarInfo     5040     +1000
TarInfo     6040     +1000
TarInfo     7040     +1000
TarInfo     8040     +1000
TarInfo     9040     +1000
TarInfo    10040     +1000
TarInfo    11040     +1000
TarInfo    12040     +1000
This continues until all 30,000 files have been processed.

Just to make sure, I commented out the lines that create the stream and process it. Memory usage stayed the same: the TarInfo instances still leak.


I am using Python 3.4.1, and this behavior is consistent on Ubuntu, OS X and Windows.

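It looks like this is actually by design. The TarFile object maintains a list of all the TarInfo objects it contains in its members attribute. Each time you call next, the TarInfo object it extracts from the archive is added to that list: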

def next(self):
    """Return the next member of the archive as a TarInfo object, when
       TarFile is opened for reading. Return None if there is no more
       available.
    """
    self._check("ra")
    if self.firstmember is not None:
        m = self.firstmember
        self.firstmember = None
        return m

    # Read the next block.
    self.fileobj.seek(self.offset)
    tarinfo = None
    ... <snip>

    if tarinfo is not None:
        self.members.append(tarinfo)  # <-- the TarInfo instance is added to members
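
The growth of the members list can be observed without objgraph. The following is a minimal sketch, assuming a small in-memory archive built just for the demonstration; build_archive and count_cached_members are hypothetical helper names, not part of tarfile. It records len(tar.members) after every call to next:

```python
import io
import tarfile


def build_archive(file_count):
    """Build a small in-memory tar archive (hypothetical test data)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for index in range(file_count):
            payload = b"x" * 16
            info = tarfile.TarInfo(name="file_%d.txt" % index)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


def count_cached_members(archive):
    """Iterate with next() and record len(tar.members) after each member."""
    sizes = []
    with tarfile.open(fileobj=archive, mode="r") as tar:
        while True:
            member = tar.next()
            if member is None:  # end of archive
                break
            sizes.append(len(tar.members))
    return sizes


sizes = count_cached_members(build_archive(5))
print(sizes)  # the cached count grows by one for every member read
```

Every TarInfo returned so far stays referenced by the TarFile object, which matches the steady +1000 growth reported by objgraph in the question.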
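
The members list therefore keeps growing as you iterate over the archive. One way around this is to clear the list after each member:

count = 0
with tarfile.open(...) as tar:
    while True:
        member = tar.next()
        if member is None:  # end of archive
            break
        stream = tar.extractfile(member)
        process_stream(stream)
        count += 1
        tar.members = []  # Clear members list
        if count % 1000 == 0:
            objgraph.show_growth(limit=10)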
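
The same workaround can be wrapped in a generator so the processing loop stays simple. This is only a sketch under assumptions: iter_members_without_cache and build_archive are hypothetical names, and the in-memory archive stands in for the real tar.xz file. Code that relies on the cache, such as getmembers(), will presumably no longer see earlier entries once the list has been cleared.

```python
import io
import tarfile


def build_archive(file_count):
    """Build a small in-memory tar archive (hypothetical test data)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for index in range(file_count):
            payload = b"x" * 16
            info = tarfile.TarInfo(name="file_%d.txt" % index)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


def iter_members_without_cache(tar):
    """Yield (TarInfo, file object) pairs, clearing tar.members as we go."""
    while True:
        member = tar.next()
        if member is None:  # end of archive
            break
        yield member, tar.extractfile(member)
        tar.members = []  # drop cached TarInfo objects, as in the answer


names = []
peak_cached = 0
with tarfile.open(fileobj=build_archive(5), mode="r") as tar:
    for member, stream in iter_members_without_cache(tar):
        if stream is not None:  # extractfile returns None for non-file members
            stream.read()
        names.append(member.name)
        peak_cached = max(peak_cached, len(tar.members))

print(names, peak_cached)
```

Because the list is cleared after each member is handled, the number of cached TarInfo objects stays bounded regardless of how many files the archive holds.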