Python speed and memory

python python-3.x performance

Hello, I'm new to Python and I have a simple function that looks like this:

def normal_list(p, data, n):
  cost = 0
  cost_list = []
  list = []
  clappend=cost_list.append
  listappend=list.append
  for i in range(n):
    x = np.random.choice(data, p=p)
    if (len(list) == 0):
        listappend(x)
    else:
        for i in list:
            if i == x:
                clappend(cost)
            elif i == list[-1]:
                    listappend(x)
            cost += 1
        cost = 0
  return cost_list

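Here p is a list of probabilities and data is a list of values; in almost all cases data is a list of the numbers 1-100. I need to speed this up because I have to call the function with n=100000, where it takes a very long time and ends with a memory error on cost_list.

Thanks for the suggestions, I just found my mistake. I messed things up because I did not break out of the loop after finding x in the list. So thanks again, and using generators is a good idea.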
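The mistake described above (not leaving the inner loop once x has been found) can be sketched as follows. This is only a minimal sketch, not the asker's final code: the name normal_list_fixed and the seed parameter are hypothetical, and the standard library's random.choices stands in for np.random.choice so that the block has no dependencies. A for ... else appends x only when the scan finished without a match.

```python
import random


def normal_list_fixed(p, data, n, seed=None):
    """Return the 0-based position at which each repeated draw was found."""
    rng = random.Random(seed)  # hypothetical seed parameter, for repeatability
    cost_list = []
    seen = []  # distinct values in order of first appearance
    for _ in range(n):
        # Stand-in for np.random.choice(data, p=p).
        x = rng.choices(data, weights=p, k=1)[0]
        for cost, value in enumerate(seen):
            if value == x:
                cost_list.append(cost)
                break  # the missing step: stop scanning once x is found
        else:
            # The loop ended without break, so x has not been seen before.
            seen.append(x)
    return cost_list


costs = normal_list_fixed([0.5, 0.3, 0.2], [1, 2, 3], 1000, seed=0)
```

With the break in place, seen can never hold more entries than there are distinct values in data, so it no longer grows with n.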

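If n is very large, consider splitting the function up and converting the pieces into generators. Using yield instead of return produces the results "on the fly" rather than collecting everything before returning, which saves memory.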

This may not be a fully functional implementation of the generators described above, but it's a start:

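import numpy as np


def sub1(lst, x):
    # Scan lst for x. Yield (cost, None) on a match, where cost is the number
    # of elements examined, or (None, x) if x is not in lst.
    cost = 0
    for e in lst:
        cost += 1
        if e == x:
            yield (cost, None)
            return  # stop scanning once x has been found
    yield (None, x)


def normal_list(p, data, n):
    lst = []
    for i in range(n):
        x = np.random.choice(data, p=p)
        if len(lst) == 0:
            lst.append(x)  # first draw: nothing to compare against yet
        else:
            for res in sub1(lst, x):
                if res[0] is not None:
                    yield res[0]  # hand the cost to the caller immediately
                else:
                    lst.append(res[1])  # x is new, so remember it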
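One way to benefit from a generator like this is to aggregate the costs as they are produced, so that no cost list is ever held in memory. The following is a minimal sketch under stated assumptions: iter_costs and mean_cost are hypothetical names, the generator takes samples that were already drawn (any iterable) instead of calling np.random.choice itself, and the cost is the 0-based position of the match, as in the question's function.

```python
from itertools import islice


def iter_costs(draws):
    """Yield the 0-based position of each repeated value among the values seen so far."""
    seen = []  # distinct values in order of first appearance
    for x in draws:
        if x in seen:
            yield seen.index(x)
        else:
            seen.append(x)


def mean_cost(draws):
    """Average the costs one at a time without building a list."""
    total = 0
    count = 0
    for cost in iter_costs(draws):
        total += cost
        count += 1
    return total / count if count else 0.0


draws = [3, 1, 3, 2, 1, 1, 2]
first_two = list(islice(iter_costs(draws), 2))  # take only what is needed
average = mean_cost(draws)
```

Because the generator is lazy, islice stops the computation after two costs, and mean_cost only ever keeps a running total and a counter.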

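Although this looks more like a memory problem than a speed problem, you could also try running it in a faster Python environment. (The first point should be enough.)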

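You can drop the outer condition check. Your list is empty only once: before the first iteration. Why not initialize it right away and skip both the check and the first iteration of the outer loop:

def normal_list(p, data, n):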
    cost = 0
    cost_list = []
    _list = [np.random.choice(data, p=p)]
    #clappend=cost_list.append # Don't do this! It's confusing!
    #listappend=_list.append(x) # Don't do this! It's confusing!
    for i in range(1,n):
        ...

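Oh, and please don't name a variable list; list() is the built-in list constructor.

What is it supposed to do? There seems to be at least quadratic O(n^2) complexity here. It looks like you could reduce it to linear O(n), but honestly I don't understand what your function is supposed to do. That is usually a bigger problem than performance.

i == list[-1] ends up being true on every iteration of the outer loop. It looks like you always append x to list unless it is the number of the current iteration. Is that right? If so, the code can be optimized much further.

x should only be added to list when the list does not already contain the value x. The function counts how many iterations it takes to find that value.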
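Given that clarification, the inner scan can be avoided entirely. Since a value is appended at most once, its cost is simply its position in order of first appearance, and a dictionary can return that position in constant time. The following is a minimal sketch of this alternative technique, not code from the thread: costs_with_dict is a hypothetical helper that works on samples drawn beforehand and reports 0-based positions, as the question's function does.

```python
def costs_with_dict(draws):
    """Return, for each repeated value, its 0-based position among the values seen so far."""
    position = {}  # value -> index it would have in the question's list
    cost_list = []
    for x in draws:
        if x in position:
            cost_list.append(position[x])  # O(1) lookup instead of a scan
        else:
            position[x] = len(position)  # x is new: it takes the next slot
    return cost_list


example = costs_with_dict([3, 1, 3, 2, 1, 1, 2])
```

Assuming NumPy is used as in the question, the n samples could be drawn in a single call with np.random.choice(data, size=n, p=p) and passed in as draws, which also avoids calling the sampler once per iteration.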