Get intersecting rows across two 2D numpy arrays

I want to get the intersecting (common) rows across two 2D numpy arrays. For example, if the following arrays are passed as inputs:

array([[1, 4],
       [2, 5],
       [3, 6]])

array([[1, 4],
       [3, 6],
       [7, 8]])

the output should be:

array([[1, 4],
       [3, 6]])

I know how to do this with loops. I'm looking for a Pythonic/Numpy way to do it.

You could use Python's sets:

>>> import numpy as np
>>> A = np.array([[1,4],[2,5],[3,6]])
>>> B = np.array([[1,4],[3,6],[7,8]])
>>> aset = set([tuple(x) for x in A])
>>> bset = set([tuple(x) for x in B])
>>> np.array([x for x in aset & bset])
array([[1, 4],
       [3, 6]])

As Rob Cowie points out, this can be done more concisely as

np.array([x for x in set(tuple(x) for x in A) & set(tuple(x) for x in B)])

There's probably a way to do this without all the going back and forth between arrays and tuples, but it's not coming to me right now.

For short arrays, using sets is probably the clearest and most readable way to do it.
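As a minimal sketch of the set approach wrapped in a reusable function: the helper name intersect_rows_set is hypothetical, and sorting the result is an added assumption so that the row order is reproducible (a plain set has no guaranteed order). The reshape keeps an empty intersection two-dimensional.

```python
import numpy as np

def intersect_rows_set(A, B):
    """Rows common to A and B, found with Python sets (hypothetical helper)."""
    common = set(map(tuple, A)) & set(map(tuple, B))
    # sorted() gives a reproducible row order; passing the dtype and
    # reshaping keep an empty intersection as a (0, ncols) array.
    out = np.array(sorted(common), dtype=A.dtype)
    return out.reshape(-1, A.shape[1])

A = np.array([[1, 4], [2, 5], [3, 6]])
B = np.array([[1, 4], [3, 6], [7, 8]])

print(intersect_rows_set(A, B))              # rows [1, 4] and [3, 6]
print(intersect_rows_set(A, B + 100).shape)  # (0, 2): nothing in common
```

This assumes both arrays have at least one column and hold hashable scalar values such as integers.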

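Another way is to use numpy.intersect1d. You'll have to trick it into treating the rows as a single value, though... This makes things a bit less readable.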

import numpy as np

A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])

nrows, ncols = A.shape
dtype={'names':['f{}'.format(i) for i in range(ncols)],
       'formats':ncols * [A.dtype]}

C = np.intersect1d(A.view(dtype), B.view(dtype))

# This last bit is optional if you're okay with "C" being a structured array...
C = C.view(A.dtype).reshape(-1, ncols)

For large arrays, this should be much faster than using sets.
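The view trick only works when each row occupies one contiguous block of memory and both arrays share a dtype. The sketch below wraps the same idea in a function with those checks added; the name intersect_rows_view and the validation are assumptions for illustration, not part of the answer above.

```python
import numpy as np

def intersect_rows_view(A, B):
    """Row intersection via a structured view (hypothetical helper)."""
    # .view() with a wider dtype needs contiguous rows, so copy if necessary.
    A = np.ascontiguousarray(A)
    B = np.ascontiguousarray(B)
    if A.ndim != 2 or B.ndim != 2 or A.shape[1] != B.shape[1]:
        raise ValueError("A and B must be 2D with the same number of columns")
    if A.dtype != B.dtype:
        raise ValueError("A and B must have the same dtype")
    ncols = A.shape[1]
    # One field per column, so that each row becomes a single structured value.
    row_dtype = np.dtype([('f{}'.format(i), A.dtype) for i in range(ncols)])
    C = np.intersect1d(A.view(row_dtype), B.view(row_dtype))
    return C.view(A.dtype).reshape(-1, ncols)

wide = np.arange(12).reshape(3, 4)
A = wide[:, ::2]                      # non-contiguous slice: [[0, 2], [4, 6], [8, 10]]
B = np.array([[4, 6], [8, 10], [1, 1]])
result = intersect_rows_view(A, B)
print(result)                         # rows [4, 6] and [8, 10]
```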

Another way to achieve this using structured arrays:

>>> a = np.array([[3, 1, 2], [5, 8, 9], [7, 4, 3]])
>>> b = np.array([[2, 3, 0], [3, 1, 2], [7, 4, 3]])
>>> av = a.view([('', a.dtype)] * a.shape[1]).ravel()
>>> bv = b.view([('', b.dtype)] * b.shape[1]).ravel()
>>> np.intersect1d(av, bv).view(a.dtype).reshape(-1, a.shape[1])
array([[3, 1, 2],
       [7, 4, 3]])

Just for clarity, the structured view looks like this:

>>> a.view([('', a.dtype)] * a.shape[1])
array([[(3, 1, 2)],
       [(5, 8, 9)],
       [(7, 4, 3)]],
       dtype=[('f0', '<i8'), ('f1', '<i8'), ('f2', '<i8')])
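To see why the structured view matters, the short check below compares a plain np.intersect1d call with the structured version. The arrays are the ones suggested in the comment thread further down; without the view, intersect1d compares individual numbers rather than whole rows.

```python
import numpy as np

A = np.array([[4, 1], [2, 5], [3, 6]])
B = np.array([[1, 4], [3, 6], [7, 8]])

# Plain intersect1d flattens both inputs and intersects the scalars.
flat = np.intersect1d(A, B)
print(flat)                 # [1 3 4 6]
print(flat.reshape(-1, 2))  # [[1 3], [4 6]]: neither is a row of A or B

# With a structured view, each row is compared as one value.
row_dtype = [('', A.dtype)] * A.shape[1]
rows = np.intersect1d(A.view(row_dtype), B.view(row_dtype))
print(rows.view(A.dtype).reshape(-1, A.shape[1]))  # [[3 6]]
```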

I don't understand why no pure numpy way of doing this has been suggested, so I found one that uses numpy broadcasting. The basic idea is to transform one of the arrays to 3D by swapping axes. Let's construct two arrays:

a=np.random.randint(10, size=(5, 3))
b=np.zeros_like(a)
b[:4,:]=a[np.random.randint(a.shape[0], size=4), :]

In my run it gave:
a=array([[5, 6, 3],
   [8, 1, 0],
   [2, 1, 4],
   [8, 0, 6],
   [6, 7, 6]])
b=array([[2, 1, 4],
   [2, 1, 4],
   [6, 7, 6],
   [5, 6, 3],
   [0, 0, 0]])

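The steps are (the arrays can be interchanged):
In a function with two lines, to reduce the memory used (correct me if I'm wrong):
This gave, for my example, the result: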
result=array([[5, 6, 3],
       [2, 1, 4],
       [6, 7, 6]])
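A minimal sketch of the broadcasting idea, under the assumption that it follows the expression quoted in the comments, np.swapaxes(A[:,:,None],1,2)==b, which is equivalent to a[:, None, :] == b. It is not necessarily the author's exact sequence of steps, and the function name is hypothetical. The arrays are the ones from the run shown above.

```python
import numpy as np

def intersect_rows_broadcast(a, b):
    """Rows of a that also occur in b, using broadcasting (hypothetical helper)."""
    # a[:, None, :] has shape (n, 1, k); comparing with b of shape (m, k)
    # broadcasts to (n, m, k): every row of a against every row of b.
    eq = a[:, None, :] == b[None, :, :]
    # A pair of rows matches only if all k columns are equal -> (n, m).
    row_match = eq.all(axis=2)
    # A row of a is kept if it matches at least one row of b -> (n,).
    mask = row_match.any(axis=1)
    return a[mask]

a = np.array([[5, 6, 3], [8, 1, 0], [2, 1, 4], [8, 0, 6], [6, 7, 6]])
b = np.array([[2, 1, 4], [2, 1, 4], [6, 7, 6], [5, 6, 3], [0, 0, 0]])

result = intersect_rows_broadcast(a, b)
print(result)  # rows [5, 6, 3], [2, 1, 4] and [6, 7, 6], in the order they appear in a
```

Rows that are duplicated in a are kept as they are; np.unique(result, axis=0) would remove them at the cost of sorting the rows. The intermediate boolean array has n * m * k elements, so memory use grows quickly with the number of rows.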

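This is faster than the set solutions because it uses only simple numpy operations while constantly reducing dimensions, and it is ideal for two big matrices. I may have made mistakes in my comments, since I got the answer through experimentation and instinct. The equivalent for column intersection can be found either by transposing the arrays or by changing the steps slightly. Also, if duplicates are wanted, the steps inside "/" must be skipped. The function can be edited to return only the boolean array of the indices, which came in handy for me when trying to index different arrays with the same vector. Benchmark for the voted answer and mine (the number of elements in each dimension plays a role in which to choose):
Code:
with results: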
Small arrays:
     Voted answer: 7.47108459473e-05
     Proposed answer: 2.47001647949e-05
Big column arrays:
     Voted answer: 0.00198730945587
     Proposed answer: 0.0560171294212
Big row arrays:
     Voted answer: 0.00500325918198
     Proposed answer: 0.000308241844177
Big arrays:
     Voted answer: 0.000864889621735
     Proposed answer: 0.00257176160812

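The verdict is that if you have to compare two big 2D arrays of 2D points, use the voted answer. If you have big matrices in all dimensions, the voted answer is the best one by all means. So it depends on what you have each time.
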
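A rough way to reproduce this kind of comparison is sketched below. Everything here is an assumption made for illustration: the array shapes, the use of timeit, and the choice of the set-based approach as the stand-in for the "voted answer", since the thread does not say which answer that was. No timings are quoted because they depend on the machine and the numpy version.

```python
import timeit
import numpy as np

def rows_by_sets(a, b):
    # Set-based row intersection, sorted for a reproducible order.
    common = set(map(tuple, a)) & set(map(tuple, b))
    return np.array(sorted(common)).reshape(-1, a.shape[1])

def rows_by_broadcast(a, b):
    # Broadcasting-based row intersection, de-duplicated and sorted by np.unique.
    mask = (a[:, None, :] == b[None, :, :]).all(axis=2).any(axis=1)
    return np.unique(a[mask], axis=0)

rng = np.random.default_rng(0)
shapes = {"small": (5, 3), "many columns": (5, 1000), "many rows": (1000, 3)}

for label, shape in shapes.items():
    a = rng.integers(10, size=shape)
    b = rng.integers(10, size=shape)
    b[: shape[0] // 2] = a[: shape[0] // 2]   # guarantee some shared rows
    t_sets = timeit.timeit(lambda: rows_by_sets(a, b), number=20)
    t_bcast = timeit.timeit(lambda: rows_by_broadcast(a, b), number=20)
    print(f"{label}: sets {t_sets:.4f}s, broadcast {t_bcast:.4f}s")
```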
np.array(list(set(map(tuple, b)).intersection(map(tuple, a))))

This could also work.

A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])

def matching_rows(A,B):
  matches=[i for i in range(B.shape[0]) if np.any(np.all(A==B[i],axis=1))]
  if len(matches)==0:
    return B[matches]
  return np.unique(B[matches],axis=0)

>>> matching_rows(A,B)
array([[1, 4],
       [3, 6]])


Of course, this assumes the rows are all the same length.

import numpy as np

A=np.array([[1, 4],
       [2, 5],
       [3, 6]])

B=np.array([[1, 4],
       [3, 6],
       [7, 8]])

intersetingRows=[(B==irow).all(axis=1).any() for irow in A]
print(A[intersetingRows])

We can use broadcasting to create a boolean mask, which can then be used to filter the rows in array A that are also present in array B.

A = np.array([[1,4],[2,5],[3,6]])
B = np.array([[1,4],[3,6],[7,8]])

m = (A[:, None] == B).all(-1).any(1)

>>> A[m]
array([[1, 4],
       [3, 6]])
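If the positions of the matching rows are needed as well, the pairwise comparison can be kept before it is collapsed with any. This is a small sketch under that assumption; the names match, ia and ib are hypothetical.

```python
import numpy as np

A = np.array([[1, 4], [2, 5], [3, 6]])
B = np.array([[1, 4], [3, 6], [7, 8]])

# match[i, j] is True when row i of A equals row j of B.
match = (A[:, None] == B).all(-1)

# Index pairs of the matching rows: A[ia[k]] equals B[ib[k]].
ia, ib = np.nonzero(match)
print(ia)  # [0 2]
print(ib)  # [0 1]
```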


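I agree. I can't find any simple "native" way. A one-line version could be common = set(tuple(i) for i in A) & set(tuple(i) for i in B)
If you want to use set, you can use the intersection function: set.intersection(aset, bset)
@rhinoinrepose - Is using that function faster than &?
np.intersect1d(A, B).reshape(-1, ncols) gets the same result.
Scratch the previous comment... You're right... It should work in all cases.
Actually, no, it doesn't work. (I realized that earlier, and then forgot!) Without the structured dtype, it doesn't treat things as rows, just as "raw" numbers. Consider something like A = np.array([[4,1],[2,5],[3,6]]) and B = np.array([[1,4],[3,6],[7,8]])
@Karthik - Are you getting a ValueError: zero length field name in format? I used new-style string formatting. On python2.6, you'll need to change 'names':['f{}'.format... to 'names':['f{0}'.format...
You can replace that dtype line with: dtype=(",".join([str(A.dtype)]*ncols)). No names are specified, so it defaults to f0, f1, etc.
Getting ValueError: 'axis' entry is out of bounds at the line 'c=np.prod(c, axis=2)'. Strange, since c should have 3 dimensions after issuing the command 'c=np.swapaxes(A[:,:,None],1,2)==b'..
Very neat. It can also return the indices with a small modification: np.where(np.prod(np.swapaxes(Array_a[:,:,None],1,2) == Array_b, axis=2).astype(bool))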

Without index:

import numpy as np

def intersect2D(Array_A, Array_B):
  """
  Find row intersection between 2D numpy arrays, a and b.
  """

  # Using tuple comparison
  intersectionList = list(set([tuple(x) for x in Array_A for y in Array_B if tuple(x) == tuple(y)]))
  print("intersectionList = \n", intersectionList)

  # Using the numpy function "array_equal" (this method is valid for an ndarray)
  intersectionList = list(set([tuple(x) for x in Array_A for y in Array_B if np.array_equal(x, y)]))
  print("intersectionList = \n", intersectionList)

  # Using set and bitwise and
  intersectionList = [list(y) for y in (set([tuple(x) for x in Array_A]) & set([tuple(x) for x in Array_B]))]
  print("intersectionList = \n", intersectionList)

  return intersectionList

With index:

def intersect2D(Array_A, Array_B):
  """
  Find row intersection between 2D numpy arrays, a and b.
  Returns another numpy array with shared rows and index of items in A & B arrays
  """
  # (i, j) pairs where row i of Array_A equals row j of Array_B.
  # Comparing tuple(x) == tuple(y) instead of np.array_equal(x, y) works as well.
  pairs = [(i, j) for i, x in enumerate(Array_A) for j, y in enumerate(Array_B) if np.array_equal(x, y)]

  # Keep the indices as integer arrays; packing (i, j, row) into a single
  # array would be ragged and is rejected by recent numpy versions.
  idx = np.array([i for i, _ in pairs], dtype=int)
  idy = np.array([j for _, j in pairs], dtype=int)

  # Shared rows, taken from Array_A; shape (0, ncols) when nothing matches
  intersectionList = Array_A[idx]

  return intersectionList, idx, idy
