Python NumPy: removing unique rows based on a column

I am trying to get an array from which every row whose first-column value is unique has been removed. My array currently looks like this:

[['Aaple' 'Red']
 ['Aaple' '0.0']
 ['Banana' 'Yellow']
 ['Banana' '0.0']
 ['Orange' 'Orange']
 ['Pear' 'Yellow']
 ['Pear' '0.0']
 ['Strawberry' 'Red']]

I would like it to look like this:

[['Aaple' 'Red']
 ['Aaple' '0.0']
 ['Banana' 'Yellow']
 ['Banana' '0.0']
 ['Pear' 'Yellow']
 ['Pear' '0.0']]

That is, rows whose first-column value appears only once are removed. My current code looks like this:

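import numpy as np

# Names to add, and the existing (name, colour) rows
arr = np.array(["Aaple", "Pear", "Banana"])
arr2 = np.array([["Strawberry", "Red"], ["Aaple", "Red"], ["Orange", "Orange"], ["Pear", "Yellow"], ["Banana", "Yellow"]])

# Pair every name in arr with a zero; in the stacked string array it shows up as '0.0'
arr = arr.reshape(-1, 1)
zero_arr = np.zeros((len(arr), 1))
arr = np.column_stack((arr, zero_arr))

# Append the new rows to arr2 and sort all rows by the first column
combine = np.vstack((arr2, arr))
sort = combine[combine[:, 0].argsort()]
# sort is the first array shown above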

I can get the first-column values of the rows I want to keep by adding
x = sort[:-1][sort[1:] == sort[:-1]]
which gives
['Aaple' 'Banana' 'Pear']
What would the next steps be?

It may be easier to use pandas:
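
The answer's code is not present in this copy. As a minimal sketch, assuming the idea was along the lines of `DataFrame.duplicated` with `keep=False` (an assumption, not necessarily what the answer used), the rows whose first-column value occurs more than once could be selected like this. The name `kept` and the literal `rows` array (standing in for `sort` from the question) are illustrative only.

```python
import numpy as np
import pandas as pd

# Stand-in for the sorted array `sort` from the question
rows = np.array([
    ["Aaple", "Red"], ["Aaple", "0.0"],
    ["Banana", "Yellow"], ["Banana", "0.0"],
    ["Orange", "Orange"],
    ["Pear", "Yellow"], ["Pear", "0.0"],
    ["Strawberry", "Red"],
])

df = pd.DataFrame(rows)

# keep=False marks every member of a duplicated group, not only the repeats,
# so rows whose value in column 0 occurs exactly once are left unmarked
mask = df.duplicated(subset=[0], keep=False)

# Back to a NumPy array containing only the rows that were marked
kept = df[mask].to_numpy()
print(kept)
```

With `keep=False`, 'Orange' and 'Strawberry' are the only first-column values that are not flagged, so their rows are dropped.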

Result:

array([['Aaple', 'Red'],
       ['Aaple', '0.0'],
       ['Banana', 'Yellow'],
       ['Banana', '0.0'],
       ['Pear', 'Yellow'],
       ['Pear', '0.0']], dtype=object)

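I am trying to do all of this in NumPy. I have a large dataset, and using pandas takes too long. Is it not possible with NumPy? @stef

@AndrewHorowitz Since pandas uses NumPy under the hood, this can probably be done in pure NumPy, but I don't think it would be much faster than pandas, because the operations involved here are carried out on NumPy arrays anyway (I may be wrong).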
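
For the pure NumPy route raised in the comments, one possible sketch (not taken from the thread) counts how often each first-column value occurs with `np.unique` and keeps the rows whose count is above one. The helper name `keep_duplicated_rows` is hypothetical, and no claim is made here about how its speed compares with pandas.

```python
import numpy as np

def keep_duplicated_rows(rows, column=0):
    """Return the rows whose value in `column` occurs more than once."""
    keys = rows[:, column]
    # inverse maps each row to the index of its key among the unique keys;
    # counts holds the number of occurrences of each unique key
    _, inverse, counts = np.unique(keys, return_inverse=True, return_counts=True)
    # counts[inverse] is the occurrence count of each row's own key
    mask = counts[inverse] > 1
    return rows[mask]

# Stand-in for the sorted array `sort` from the question
rows = np.array([
    ["Aaple", "Red"], ["Aaple", "0.0"],
    ["Banana", "Yellow"], ["Banana", "0.0"],
    ["Orange", "Orange"],
    ["Pear", "Yellow"], ["Pear", "0.0"],
    ["Strawberry", "Red"],
])

result = keep_duplicated_rows(rows)
print(result)
```

Because the boolean mask is applied in the original row order, the input does not need to be sorted first; the order of the surviving rows is unchanged.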