Python: speed up a loop that fills an array with the closest values from another array


python arrays performance loops numpy


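I have a piece of code that I need to optimize as much as possible, since I have to run it several thousand times.

What it does is find the closest float to a random float in one sub-list of a given array, and store the corresponding float (i.e. the one with the same index) from the other sub-list of that array. It repeats the process until the sum of the stored floats reaches a certain limit.

Here is a MWE to make it clearer: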

import numpy as np

# Define array with two sub-lists.
a = [np.random.uniform(0., 100., 10000), np.random.random(10000)]

# Initialize empty final list.
b = []

# Run until the condition is met.
while (sum(b) < 10000):

    # Draw random [0,1) value.
    u = np.random.random()
    # Find closest value in sub-list a[1].
    idx = np.argmin(np.abs(u - a[1]))
    # Store value located in sub-list a[0].
    b.append(a[0][idx])


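The code is fairly simple, but I haven't found a way to speed it up. I tried to adapt the great (and very fast) answer given to a similar question I asked a while ago, but to no avail.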

Sort your reference array. That allows log(n) lookups instead of scanning the whole list (for example, using bisect to find the closest element).

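First, I swap a[0] and a[1] to simplify sorting, and reorder both sub-lists together so that matching indices stay paired (calling np.sort on the 2-D array directly would sort each row independently and break the pairing):

a = np.array([np.random.random(10000), np.random.uniform(0., 100., 10000)])
a = a[:, np.argsort(a[0])]

Now a is sorted by a[0], which means that to find the value closest to an arbitrary number you can start with a bisection: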

import bisect

b = []
while (sum(b) < 10000):
    # Draw random [0,1) value.
    u = np.random.random()
    # Find the insertion point of u in the sorted sub-list a[0].
    idx = bisect.bisect(a[0], u)
    # The closest value is at idx or idx - 1; idx == len(a[0]) is out of range.
    if idx == len(a[0]) or (idx != 0 and
                            np.abs(a[0][idx] - u) > np.abs(a[0][idx - 1] - u)):
        idx = idx - 1
    # Store value located in sub-list a[1].
    b.append(a[1][idx])
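The same sorted-array idea can also be applied to many random draws at once. The following is only a sketch, not part of the answer above: it uses np.searchsorted in place of bisect, and the helper name nearest_indices as well as the variable names keys, values and picked are made up for illustration. It assumes the key array has at least two elements and that keys and values were reordered together.

```python
import numpy as np

def nearest_indices(sorted_keys, queries):
    """For each query, return the index of the closest entry in sorted_keys.

    Hypothetical helper; assumes sorted_keys is sorted ascending and has
    at least two elements.
    """
    # Insertion points of the queries in the sorted keys.
    pos = np.searchsorted(sorted_keys, queries)
    # Clamp so that both pos - 1 and pos are valid indices.
    pos = np.clip(pos, 1, len(sorted_keys) - 1)
    left = sorted_keys[pos - 1]
    right = sorted_keys[pos]
    # Pick the left neighbour wherever it is at least as close as the right one.
    return np.where(queries - left <= right - queries, pos - 1, pos)

rng = np.random.default_rng(0)
keys = rng.random(10000)                 # plays the role of the [0, 1) sub-list
values = rng.uniform(0., 100., 10000)    # plays the role of the other sub-list

# Sort both arrays by the keys so that matching indices stay paired.
order = np.argsort(keys)
keys, values = keys[order], values[order]

# Look up 1000 random draws in a single vectorized call.
u = rng.random(1000)
picked = values[nearest_indices(keys, u)]
```

Whether this pays off depends on how many draws are needed per run; for a few hundred draws the cost of sorting may dominate.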


One obvious optimization: don't recompute the sum on every iteration; accumulate it instead.

b_sum = 0
while b_sum<10000:
    ....
    idx = np.argmin(np.abs(u - a[1]))
    add_val = a[0][idx]
    b.append(add_val)
    b_sum += add_val

It may save some run time, although I don't think it will make much of a difference.
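For reference, here is a self-contained sketch of the running-sum variant with the elided lines filled in from the question's loop. The function name sample_with_running_sum and the lim parameter are hypothetical and only serve to make the snippet runnable.

```python
import numpy as np

def sample_with_running_sum(a, lim=10000):
    """Question's loop with the total accumulated instead of re-summed.

    Hypothetical wrapper: a[0] holds the values to store, a[1] the [0, 1)
    values that are searched for the closest match.
    """
    b = []
    b_sum = 0.0
    while b_sum < lim:
        # Draw random [0,1) value.
        u = np.random.random()
        # Find closest value in sub-list a[1] (still a full linear scan).
        idx = np.argmin(np.abs(u - a[1]))
        add_val = a[0][idx]
        b.append(add_val)
        # Update the running total rather than calling sum(b) again.
        b_sum += add_val
    return b

a = [np.random.uniform(0., 100., 10000), np.random.random(10000)]
b = sample_with_running_sum(a)
```

The linear scan in np.argmin still dominates each iteration, which is consistent with the modest gain expected above.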


Write it in Cython. That will give you much more of a benefit for a high-iteration operation.

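OK, here's a slightly left-field suggestion. As I understand it, you are just trying to sample uniformly from the elements in a[0] until you have a list whose sum exceeds some limit.

Although it is more costly memory-wise, I think you'll probably find it much faster to first generate a large random sample from a[0], then take the cumulative sum and find where it first exceeds your limit.

For example: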

import numpy as np

# array of reference float values, equivalent to a[0]
refs = np.random.uniform(0, 100, 10000)

def fast_samp_1(refs, lim=10000, blocksize=10000):

    # sample uniformly from refs
    samp = np.random.choice(refs, size=blocksize, replace=True)
    samp_sum = np.cumsum(samp)

    # find where the cumsum first exceeds your limit
    last = np.searchsorted(samp_sum, lim, side='right')
    return samp[:last + 1]

    # # if it's ok to be just under lim rather than just over then this might
    # # be quicker
    # return samp[samp_sum <= lim]

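Note that concatenating arrays is pretty slow, so it's best to make blocksize large enough to be reasonably sure that the sum of a single block is >= your limit, without being excessively large.

Update: I've adapted your original function a little so that its syntax more closely matches mine.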

def orig_samp(refs, lim=10000):

    # Initialize empty final list.
    b = []

    a1 = np.random.random(10000)

    # Run until the condition is met.
    while (sum(b) < lim):

        # Draw random [0,1) value.
        u = np.random.random()
        # Find closest value in sub-list a[1].
        idx = np.argmin(np.abs(u - a1))
        # Store value located in sub-list a[0].
        b.append(refs[idx])

    return b

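fast_samp_1 comes out about 3 orders of magnitude faster than orig_samp. You can do slightly better by reducing the block size a bit: you basically want it to be comfortably larger than the length of the arrays you're getting out. In this case, you know that on average the output will be about 200 elements long, since the mean of all real numbers between 0 and 100 is 50, and 10000 / 50 = 200.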
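The back-of-the-envelope estimate above can be turned into a small helper. This is only a rough sketch: the function name suggest_blocksize and the margin factor are assumptions made for illustration, and lim / mean slightly underestimates the true expected length, which is one more reason to leave some headroom.

```python
import numpy as np

def suggest_blocksize(refs, lim=10000, margin=2.0):
    """Rough block size: expected output length times a safety margin.

    Hypothetical helper; margin is an arbitrary headroom factor.
    """
    # Expected number of samples needed to reach lim, e.g. 10000 / 50 = 200.
    expected_len = lim / np.mean(refs)
    return int(np.ceil(margin * expected_len))

refs = np.random.uniform(0, 100, 10000)
blocksize = suggest_blocksize(refs)  # on the order of 400 for these refs
```

A block of this size can then be passed as the blocksize argument of the sampling functions instead of the default of 10000.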

Update 2: It's easy to get a weighted rather than a uniform sample; you just pass the p= parameter to np.random.choice:

def weighted_fast_samp(refs, weights=None, lim=10000, blocksize=10000):

    samp = np.random.choice(refs, size=blocksize, replace=True, p=weights)
    samp_sum = np.cumsum(samp)

    # is the sum of our current block of samples >= lim?
    while samp_sum[-1] < lim:

        # if not, we'll sample another block and try again until it is
        newsamp = np.random.choice(refs, size=blocksize, replace=True, 
                                   p=weights)
        samp = np.hstack((samp, newsamp))
        samp_sum = np.hstack((samp_sum, np.cumsum(newsamp) +  samp_sum[-1]))

    last = np.searchsorted(samp_sum, lim, side='right')
    return samp[:last + 1]
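One practical detail when building the weights: np.random.choice expects p to have the same length as the array being sampled and to sum to 1. The sketch below normalizes a set of raw weights before sampling; the choice of weighting (proportional to the value itself) is purely illustrative and not taken from the answer.

```python
import numpy as np

refs = np.random.uniform(0, 100, 10000)

# Hypothetical weighting: larger reference values are drawn more often.
raw_weights = refs.copy()

# Normalize so the probabilities sum to 1, as np.random.choice requires.
weights = raw_weights / raw_weights.sum()

# Draw one block of weighted samples, with replacement.
samp = np.random.choice(refs, size=1000, replace=True, p=weights)
```

The resulting weights array can be passed as the weights argument of weighted_fast_samp.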
