Difference in output with Python Numba


python numpy


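For my coursework, I implemented a basic nearest-neighbor search. The plain NumPy implementation works fine, but simply adding the `@jit` decorator (compiling with Numba) changes the output: for some unknown reason, it ends up duplicating some of the neighbors.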

Here is the basic algorithm:

import numpy as np
from numba import jit

@jit(nopython=True)
def knn(p, points, k):
    '''Find the k nearest neighbors (brute force) of the point p
    in the list points (each row is a point)'''

    n = p.size  # Length of each point (number of coordinates)
    M = points.shape[0]  # Number of points
    neighbors = np.zeros((k,n))
    distances = 1e6*np.ones(k)

    for i in range(M):
        d = 0
        pt = points[i, :]  # Point to compare
        for r in range(n):  # For each coordinate
            aux = p[r] - pt[r]
            d += aux * aux
        if d < distances[k-1]:  # We find a new neighbor
            pos = k-1
            while pos>0 and d<distances[pos-1]:  # Find the position
                pos -= 1
            pt = points[i, :]
            # Insert neighbor and distance:
            neighbors[pos+1:, :] = neighbors[pos:-1, :]
            neighbors[pos, :] = pt
            distances[pos+1:] = distances[pos:-1]
            distances[pos] = d

    return neighbors, distances

Without the @jit decorator, you get the correct answer:

In [1]: distances
Out[1]: array([ 0.3933974 ,  0.44754336,  0.54548715,  0.55619749,  0.5657846 ])

But the Numba-compiled version gives a strange result:

In [2]: distances
Out[2]: array([ 0.3933974 ,  0.44754336,  0.54548715,  0.54548715,  0.54548715])

Can anyone help? I have no idea why this happens.

Thanks.

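I believe the problem is that Numba handles writing one slice into another differently when the two slices overlap than when they do not. I'm not familiar with NumPy's internals, but it may have special logic for memory operations on overlapping regions like this one, and that logic may be missing in Numba. Change the following lines and the result with the jit decorator becomes consistent with the plain Python version: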

neighbors[pos+1:, :] = neighbors[pos:-1, :].copy()
...
distances[pos+1:] = distances[pos:-1].copy()
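
To illustrate why the `.copy()` may matter, here is a minimal pure-NumPy sketch. It assumes (this is not confirmed as what Numba actually does) that the compiled assignment copies element by element from front to back without a temporary buffer. The helper names `shift_right_forward` and `shift_right_buffered` are hypothetical and exist only for this illustration.

```python
import numpy as np

def shift_right_forward(a, pos):
    """Emulate a naive front-to-back copy of a[pos:-1] into a[pos+1:].

    Assumption for illustration only: source and destination share memory
    and no temporary buffer is used.
    """
    out = a.copy()
    for j in range(pos, len(out) - 1):
        # out[j] may already have been overwritten by the previous step,
        # so the same value gets propagated to the end of the array.
        out[j + 1] = out[j]
    return out

def shift_right_buffered(a, pos):
    """Shift using an explicit copy of the source slice, as in the fix above."""
    out = a.copy()
    out[pos + 1:] = out[pos:-1].copy()
    return out

d = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
forward = shift_right_forward(d, 2)    # [1. 2. 3. 3. 3.]
buffered = shift_right_buffered(d, 2)  # [1. 2. 3. 3. 4.]
print(forward)
print(buffered)
```

The repeated trailing value in the first result has the same shape as the duplicated distances reported in the question, which is consistent with the overlap explanation but does not prove it.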

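Thanks @JoshAdel! That works for me. I had verified earlier that overlapping slices do not cause problems in NumPy, but for some reason Numba turns the assignment into a different algorithm... In any case, it is frustrating that only some neighbors get duplicated and not others... Thanks again! PS: I'm a Python fan, but things like this make me seriously consider learning Julia…

@MarioGonzález I'd encourage you to post your example as an issue on the Numba GitHub issue tracker. The development team is usually responsive and wants to hear about bugs or unexpected behavior.

Thanks @JoshAdel for the suggestion. Posted.

You might be interested in the scipy implementation.

@Ophion Thanks for the tip. I've been playing with sklearn's KDTree implementation (I think they are similar), and it works well for preprocessing data ahead of many future query points. In my work (image processing), I need to find neighbors while the list of points keeps changing, and that kind of implementation becomes too slow. When the dimension of the space is large (for example, greater than 25), the KDTree implementation does not seem to do any better than brute force.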
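
For checking the output of the loop-based `knn` independently, a vectorized brute-force search in plain NumPy can serve as a reference. This is only a sketch: `knn_bruteforce` is a hypothetical helper name, the test data is random, and like the original function it returns squared Euclidean distances (no square root is taken). Unlike the original, it does not cap distances at `1e6`.

```python
import numpy as np

def knn_bruteforce(p, points, k):
    """Hypothetical reference: the k rows of `points` closest to `p`."""
    diff = points - p                       # broadcast p against every row
    d2 = np.einsum("ij,ij->i", diff, diff)  # squared distance for each row
    order = np.argsort(d2)[:k]              # indices of the k smallest, ascending
    return points[order], d2[order]

rng = np.random.default_rng(0)
points = rng.random((200, 30))  # 200 random points in 30 dimensions
p = rng.random(30)              # random query point

neighbors, distances = knn_bruteforce(p, points, 5)
print(distances)
```

If the jitted function and this reference disagree on the same inputs, the duplicated entries in the jitted output should stand out, since the reference distances are sorted and taken from distinct rows.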
