Python: comparing numpy arrays of different sizes

python arrays numpy

I have two arrays of points with xy coordinates:

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])

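I want to keep only those points from new_pts for which basic_pts contains no point with both a larger x value and a larger y value. The result would therefore be: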

res_pts = np.array([[2, 2], [2, 1], [1.5, 0.5]])

I have a working solution, but because it relies on list comprehensions it does not scale to large amounts of data:

x_cond = ([basic_pts[:, 0] > x for x in new_pts[:, 0]])
y_cond = ([basic_pts[:, 1] > y for y in new_pts[:, 1]])
xy_cond_ = np.logical_and(x_cond, y_cond)
xy_cond = np.swapaxes(xy_cond_, 0, 1)
mask = np.invert(np.logical_or.reduce(xy_cond))
res_pts = new_pts[mask]

Is there a better way to solve this using only numpy, without list comprehensions?

You can use broadcasting -
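The code that originally followed this sentence is missing here. As a rough sketch only, assuming the answer used the same broadcasting comparison that the batched answer further down attributes to Divakar, the whole test could be written without a Python-level loop. The names x_greater, y_greater and dominated are illustrative and not taken from the original answer.

```python
import numpy as np

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])

# Compare every basic point (rows) with every new point (columns).
# Each comparison broadcasts to shape (len(basic_pts), len(new_pts)).
x_greater = basic_pts[:, np.newaxis, 0] > new_pts[np.newaxis, :, 0]
y_greater = basic_pts[:, np.newaxis, 1] > new_pts[np.newaxis, :, 1]

# A new point is rejected if any basic point exceeds it in both x and y.
dominated = (x_greater & y_greater).any(axis=0)
res_pts = new_pts[~dominated]
```

For the sample data this keeps [2, 2], [2, 1] and [1.5, 0.5], matching the result asked for in the question.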

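As val pointed out, a solution that creates an intermediate len(basic_pts) × len(new_pts) array can use a lot of memory. On the other hand, a solution that tests each point of new_pts in a loop can be too slow. We can bridge the gap by picking a batch size k and testing new_pts in batches of size k using Divakar's solution: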

import numpy as np
basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])
k = 2
subresults = []
for i in range(0, len(new_pts), k):
    j = min(i + k, len(new_pts))
    # Process new_pts[i:j] using Divakar's solution
    xyc = np.logical_and(
        basic_pts[:, np.newaxis, 0] > new_pts[np.newaxis, i:j, 0],
        basic_pts[:, np.newaxis, 1] > new_pts[np.newaxis, i:j, 1])
    mask = ~(xyc).any(axis=0)
    # mask indicates which points among new_pts[i:j] to use
    subresults.append(new_pts[i:j][mask])
# Concatenate the per-batch results
res = np.concatenate(subresults)
print(res)
# Prints:
# [[2.  2. ]
#  [2.  1. ]
#  [1.5 0.5]]
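The batched loop can also be wrapped in a reusable function. The following is a minimal sketch, not part of the original answer: the function name filter_in_batches and the max_cells parameter are hypothetical, and the batch size k is derived from a rough cap on the number of entries in each boolean intermediate array rather than chosen by hand.

```python
import numpy as np

def filter_in_batches(basic_pts, new_pts, max_cells=1_000_000):
    """Keep the rows of new_pts that no row of basic_pts exceeds in both x and y."""
    basic_pts = np.asarray(basic_pts)
    new_pts = np.asarray(new_pts)
    # Choose k so each boolean intermediate has at most about max_cells entries.
    k = max(1, max_cells // max(1, len(basic_pts)))
    keep = np.empty(len(new_pts), dtype=bool)
    for i in range(0, len(new_pts), k):
        batch = new_pts[i:i + k]
        # Shape (len(basic_pts), len(batch)): True where a basic point is larger in x and y.
        xyc = ((basic_pts[:, np.newaxis, 0] > batch[np.newaxis, :, 0]) &
               (basic_pts[:, np.newaxis, 1] > batch[np.newaxis, :, 1]))
        keep[i:i + k] = ~xyc.any(axis=0)
    return new_pts[keep]

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])
# max_cells=12 with 6 basic points gives k = 2, as in the example above.
res = filter_in_batches(basic_pts, new_pts, max_cells=12)
```

Writing into a preallocated mask avoids building a list of sub-arrays and concatenating them at the end; whether that matters in practice depends on the data size.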

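That was my idea as well; note, however, that it ends up creating an intermediate array of shape (len(basic_pts), len(new_pts)), which can use a lot of memory (the OP mentioned "large amounts of data").

@val Yeah, that could be a problem for really huge amounts of data. Thanks for pointing that out!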
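If the pairwise comparison is the bottleneck, one possible alternative, which does not come from any of the answers here, is to avoid it altogether: sort basic_pts by x, precompute for each position the largest y among the points at or after it, and look up each new point with np.searchsorted. This is only a sketch, assuming 2-D points without NaN values; the names staircase_filter and suffix_max_y are hypothetical.

```python
import numpy as np

def staircase_filter(basic_pts, new_pts):
    """Keep the rows of new_pts that no row of basic_pts exceeds in both x and y."""
    basic_pts = np.asarray(basic_pts, dtype=float)
    new_pts = np.asarray(new_pts, dtype=float)
    order = np.argsort(basic_pts[:, 0])
    bx = basic_pts[order, 0]
    by = basic_pts[order, 1]
    # suffix_max_y[i] is the largest y among the sorted basic points i, i+1, ...
    suffix_max_y = np.maximum.accumulate(by[::-1])[::-1]
    # Sentinel for new points whose x is not exceeded by any basic point.
    suffix_max_y = np.append(suffix_max_y, -np.inf)
    # Index of the first basic point with strictly larger x than each new point.
    idx = np.searchsorted(bx, new_pts[:, 0], side="right")
    dominated = suffix_max_y[idx] > new_pts[:, 1]
    return new_pts[~dominated]

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])
res = staircase_filter(basic_pts, new_pts)
```

This needs only arrays of length len(basic_pts) and len(new_pts), at the cost of one sort and one binary search per new point.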
