Python list comprehension is sometimes slow


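Some Python code I wrote a while back is coming back to haunt me. It runs slowly, and I have isolated the problem to list creation. I am working with some fairly large lists, and before I undertake a major refactoring (which may not be feasible), I would like to hear whether the experts can recommend anything.

How can I improve the performance of this code?

Full ideone code:

The code looks like this: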

# Obj and currentTimeMicro() come from the full ideone code and are not repeated here.
OBJECTS_NUM = 200

if __name__ == "__main__":  
    allLists = []
    for i in range(0, 500):
        starttime = currentTimeMicro()

        newlist = [Obj() for k in range(0, OBJECTS_NUM)]

        endtime = currentTimeMicro()
        elapsed = endtime - starttime
        print('Elapsed ' + str(elapsed))
        allLists.append(newlist)

A snippet of the output:

Elapsed 242
Elapsed 280
Elapsed 286
Elapsed 292
Elapsed 301
Elapsed 295
Elapsed 287
Elapsed 236
Elapsed 303
Elapsed 282
Elapsed 278
Elapsed 902
Elapsed 8909
Elapsed 167
Elapsed 129
Elapsed 164
Elapsed 183
Elapsed 160
Elapsed 166
Elapsed 159
Elapsed 158
Elapsed 127
Elapsed 158
Elapsed 158
Elapsed 157
Elapsed 169
Elapsed 538
Elapsed 155
Elapsed 128
Elapsed 169
Elapsed 156
Elapsed 157
Elapsed 156
Elapsed 161
Elapsed 157
Elapsed 127
Elapsed 168
Elapsed 158
Elapsed 172
Elapsed 154
Elapsed 546
Elapsed 156
Elapsed 128
Elapsed 159
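So most of the time creating a list takes about 200-300 microseconds, but occasionally it jumps to 500 or even 8900 microseconds.

I assume this is some memory-related behavior, but I am nowhere near proficient enough in Python to pin down the cause.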
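
One way to test the memory-related hypothesis is to record when the cyclic garbage collector runs and compare that with the slow iterations. The sketch below is only an illustration: it uses the standard gc.callbacks hook, the Obj class is a simplified stand-in for the one in the question, and track_gc and gc_events are hypothetical names introduced here.

```python
import gc
import time


class Obj(object):
    """Simplified stand-in for the question's class (an assumption)."""

    def __init__(self):
        self.dummy = 0
        self.testList = [66, 55, 13, 31]


gc_events = []      # (generation, pause in microseconds) for each collection
_started = [0.0]    # start time of the collection currently in progress


def track_gc(phase, info):
    # The interpreter calls this with phase "start" and "stop" around a collection.
    if phase == "start":
        _started[0] = time.perf_counter()
    elif phase == "stop":
        pause_us = (time.perf_counter() - _started[0]) * 1e6
        gc_events.append((info["generation"], pause_us))


gc.callbacks.append(track_gc)
try:
    all_lists = []
    for i in range(50):
        seen = len(gc_events)
        t0 = time.perf_counter()
        all_lists.append([Obj() for _ in range(200)])
        elapsed_us = (time.perf_counter() - t0) * 1e6
        # Generations collected while this particular list was being built.
        gens = [g for g, _ in gc_events[seen:]]
        print("Elapsed %.0f us, GC generations run: %s" % (elapsed_us, gens))
finally:
    gc.callbacks.remove(track_gc)
```

If the slow iterations consistently coincide with a collection, especially one of the oldest generation, the garbage collector is a likely cause. If they do not, the spikes probably come from somewhere else, such as the operating system scheduler.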


If you are only creating identical objects, consider replacing:

    newlist = [Obj() for k in range(0, OBJECTS_NUM)]

with a version based on copy.deepcopy(). With this change, the program becomes:

import time
import copy
import gc

# The cyclic garbage collector is switched off for the whole run.
gc.disable()


def currentTimeMicro():
    return int(round(time.time() * 1000000))


x = 0


class Obj(object):
    def __init__(self):
        self.dummy = 0
        self.dumb = 42
        self.dumber = 'ftw'
        self.dummy1 = 0
        self.dumb1 = 42
        self.dumber1 = 'ftw'
        self.dummy2 = 0
        self.dumb2 = 42
        self.dumber2 = 'ftw'
        self.testList = [66, 55, x, 13, 31, 55, x, 13, 31, 55]
        #x += 1


OBJECTS_NUM = 200

if __name__ == "__main__":
    allLists = []
    for i in range(0, 500):
        starttime = currentTimeMicro()
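        #newlist = [Obj() for k in range(0, OBJECTS_NUM)]
        # Caveat: deepcopy() treats a class as atomic and returns Obj itself,
        # and "* OBJECTS_NUM" repeats that one reference. The list therefore
        # holds no separate Obj instances, and Obj.__init__ never runs here.
        newlist = [copy.deepcopy(Obj)]*OBJECTS_NUM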
        endtime = currentTimeMicro()
        elapsed = endtime - starttime
        print('Elapsed ' + str(elapsed))
        allLists.append(newlist)
    print(str(len(allLists)))

Output:

Elapsed 18
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 6
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 2
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 9
Elapsed 4
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 14
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 6
Elapsed 4
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 7
Elapsed 4
Elapsed 5
Elapsed 4
Elapsed 7
Elapsed 3
Elapsed 4
Elapsed 6
Elapsed 4
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 6
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 4
Elapsed 7
Elapsed 4
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 14
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 4
Elapsed 8
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 4
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 8
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 14
Elapsed 4
Elapsed 14
Elapsed 3
Elapsed 3
Elapsed 13
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 4
Elapsed 4
Elapsed 7
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 6
Elapsed 4
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 4
Elapsed 6
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 4
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 7
Elapsed 3
Elapsed 4
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 12
Elapsed 6
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 2
Elapsed 5
Elapsed 3
Elapsed 7
Elapsed 4
Elapsed 3
Elapsed 6
Elapsed 4
Elapsed 4
Elapsed 4
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 6
Elapsed 4
Elapsed 4
Elapsed 5
Elapsed 4
Elapsed 6
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 6
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 7
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 7
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 12
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 7
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 2
Elapsed 3
Elapsed 5
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 7
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 2
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 2
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 6
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 7
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 2
Elapsed 2
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 7
Elapsed 3
Elapsed 5
Elapsed 2
Elapsed 2
Elapsed 4
Elapsed 2
Elapsed 4
Elapsed 3
Elapsed 2
Elapsed 5
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 4
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 4
Elapsed 5
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 6
Elapsed 3
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 7
Elapsed 4
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 5
Elapsed 3
Elapsed 4
Elapsed 3
Elapsed 3
Elapsed 3
500
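
The timings above are much lower partly because the list expression does not build 200 separate instances. The following check is a small sketch of what that expression produces, compared with two ways of getting distinct objects. The reduced Obj class and the variable names are assumptions made for illustration.

```python
import copy


class Obj(object):
    def __init__(self):
        self.dummy = 0
        self.testList = [66, 55, 13, 31]


n = 5

# The expression from the answer: deepcopy() of a class returns the class itself,
# and list multiplication repeats that single reference n times.
repeated = [copy.deepcopy(Obj)] * n

# Two ways to get n distinct instances.
distinct = [Obj() for _ in range(n)]
template = Obj()
copies = [copy.deepcopy(template) for _ in range(n)]

print(repeated[0] is Obj)                   # True
print(len({id(o) for o in repeated}))       # 1
print(len({id(o) for o in distinct}))       # 5
print(len({id(o) for o in copies}))         # 5
print(copies[0].testList is template.testList)  # False: nested list copied too
```

Copying a template once per element does give independent objects, but deepcopy() performs more work per object than calling the constructor, so it should not be expected to be faster than the original comprehension.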

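When you measure very small time intervals, you are exposed to interference from the system's process scheduling. Your program is never the only one running on the system, and it is interrupted frequently so that time can be given to other processes (however short those interruptions may be). To get a better basis for comparison, measure something that takes at least a few milliseconds.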
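
As a rough sketch of that advice, the standard timeit.repeat function can run the list construction many times per sample so that each sample lasts well beyond a few microseconds. The reduced Obj class and the CALLS_PER_SAMPLE value are assumptions chosen for illustration.

```python
import timeit


class Obj(object):
    """Simplified stand-in for the question's class (an assumption)."""

    def __init__(self):
        self.dummy = 0
        self.testList = [66, 55, 13, 31]


OBJECTS_NUM = 200
CALLS_PER_SAMPLE = 200   # assumed batch size; raise it if samples are still too short


def build_list():
    return [Obj() for _ in range(OBJECTS_NUM)]


# Five samples, each timing CALLS_PER_SAMPLE consecutive calls of build_list().
samples = timeit.repeat(build_list, number=CALLS_PER_SAMPLE, repeat=5)
per_call_us = [s / CALLS_PER_SAMPLE * 1e6 for s in samples]

print("best  %.1f us per list" % min(per_call_us))
print("worst %.1f us per list" % max(per_call_us))
```

Note that timeit turns garbage collection off while timing by default, and that this sketch discards each list instead of keeping it as the question's loop does. It therefore measures the raw construction cost, not the occasional spikes.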

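To speed up list creation, you can defer initialization until the first use of each object in the list, which spreads the initialization cost over time. This can be done with a list class that instantiates objects "just in time", when they are first referenced.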

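class ObjectList(list):
    def __init__(self, aClass, count):
        self.aClass = aClass
        # Start with placeholders; no aClass instance exists yet.
        self[:] = [None] * count

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Resolve the slice to indices so each element is created on demand.
            return [self[i] for i in range(len(self))[index]]
        item = super().__getitem__(index)
        if item is None:
            # First access: instantiate the object and store it.
            self[index] = item = self.aClass()
        return item

Usage: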

X = ObjectList(Obj,1000)

print(X[500])      # <__main__.Obj object at 0x7fa7ac805550>
print(X[502])      # <__main__.Obj object at 0x7fa7ac805748>
print(X[499:504])
# [<__main__.Obj object at 0x7fa7aaee2da0>,
#  <__main__.Obj object at 0x7fa7ac805550>,
#  <__main__.Obj object at 0x7fa7ac864a58>,
#  <__main__.Obj object at 0x7fa7ac805748>,
#  <__main__.Obj object at 0x7fa7ac864a90>]

Note that this may have unwanted side effects if the order in which the objects are created matters.
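
The deferred behavior can be observed with a small counting class. In the sketch below, ObjectList is repeated from the answer so that the block runs on its own, and Counted is a hypothetical class added only to count constructor calls.

```python
class ObjectList(list):
    def __init__(self, aClass, count):
        self.aClass = aClass
        self[:] = [None] * count

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(len(self))[index]]
        item = super().__getitem__(index)
        if item is None:
            self[index] = item = self.aClass()
        return item


class Counted(object):
    """Hypothetical class that counts how many instances were created."""
    created = 0

    def __init__(self):
        Counted.created += 1


lazy = ObjectList(Counted, 1000)
after_build = Counted.created       # nothing has been instantiated yet
first = lazy[500]                   # first access creates the object
again = lazy[500]                   # second access returns the stored object
after_access = Counted.created

# Plain iteration uses the built-in list iterator, which does not go through
# the overridden __getitem__, so untouched slots still show up as None.
raw = [item for item in lazy][:3]

print(after_build, after_access)    # 0 1
print(first is again)               # True
print(raw)                          # [None, None, None]
```

The last line shows a further caveat of this design: code that iterates over the list directly, instead of indexing it, sees the None placeholders and not the objects.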

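What does Obj() do? Sure, the garbage collector in Python may kick in from time to time, but what do you expect from Python?

@shahkalpesh See the linked ideone snippet. It creates a decently sized object (neither too small nor too large). Do you know of a typical case regarding…

Does disabling the garbage collector help? (It may cause other problems, but if you need real-time performance it may be the only way.) You could also try not to allocate new objects.

@user202729 Is there a way to verify that it is the GC? (A minimal repro example is linked on ideone.)

This will create a list that contains a reference to the class repeated OBJECTS_NUM times. I believe the OP is looking for distinct instances of the class.

Thanks for pointing that out. I have updated the same code to use deepcopy.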
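
# Timing comparison of ObjectList against the plain comprehension.
from timeit import timeit

No = 1000000

t = timeit(lambda: [Obj() for _ in range(No)], number=1)
print("comprehension", t)  # 0.886055106

# Only the placeholders are allocated here; no Obj is instantiated yet.
t = timeit(lambda: ObjectList(Obj, No), number=1)
print("ObjectList", t)  # 0.013847651000000072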