Python: vectorizing this non-unique-key operation


python numpy


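I have non-unique raw data called test. From this input I want to create an output vector, given a set of keys called rows that should receive a nonzero output, and an array called data holding the output value for each of those keys:
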
import numpy as np

rows = np.array([3, 4])
test = np.array([1, 3, 3, 4, 5])
data = np.array([-1, 2])

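My expected output is a vector, output, of shape test.shape. For each element test[j]:

If test[j] equals rows[i] for some index i, then output[j] = data[i].

Otherwise, output[j] = 0.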

In other words, the following loop produces my output:
output = np.zeros(test.shape)
for i, val in enumerate(rows):
    output[test == val] = data[i]

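Is there a way to vectorize this?

Here is one approach based on -

The last few lines have two possible alternatives, if you like one-liners -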
out = data[idx] * (rows[idx] == test) # skips using `invalid_mask`

out = np.where(invalid_mask, 0, data[idx])
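
The names idx and invalid_mask used in the two one-liners are not defined in the text above. The following is a minimal sketch of one way they could be computed. It assumes a binary search with np.searchsorted over sorted keys, and it assumes that rows is non-empty and holds unique values. The helper names order, sorted_rows and sorted_data are illustrative only and do not come from the original answer.

```python
import numpy as np

rows = np.array([3, 4])
test = np.array([1, 3, 3, 4, 5])
data = np.array([-1, 2])

# Assumption: keys in rows are unique. Sort them so a binary search is valid,
# and reorder data the same way so each key keeps its value.
order = np.argsort(rows)
sorted_rows = rows[order]
sorted_data = data[order]

# Insertion position of each test value within the sorted keys.
idx = np.searchsorted(sorted_rows, test)
# Values larger than every key get idx == len(rows); clip to keep indexing safe.
idx = np.minimum(idx, len(sorted_rows) - 1)

# True wherever the nearest candidate key does not actually match the test value.
invalid_mask = sorted_rows[idx] != test

out = np.where(invalid_mask, 0, sorted_data[idx])
print(out)
```

Because the search is a comparison rather than a table lookup, this sketch does not depend on the values being small or non-negative.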

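This method only works when test and rows consist of integers that are not too large (they should also be non-negative, although that restriction can be relaxed if needed). But it is fast: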
>>> import numpy as np
>>> rows = np.array([3, 4])
>>> test = np.array([1, 3, 3, 4, 5])
>>> data = np.array([-1, 2])
>>> 
>>> limit = 1<<20  # cap on the lookup-table size
>>> # both index arrays must have an integer dtype
>>> assert all(np.issubdtype(a.dtype, np.integer) for a in (rows, test))
>>> assert np.all(rows>=0) and np.all(test>=0)
>>> mx = np.maximum(np.max(rows), np.max(test)) + 1
>>> assert mx <= limit
>>> lookup = np.empty((mx,), data.dtype)
>>> lookup[test] = 0     # default for every value that occurs in test
>>> lookup[rows] = data  # overwrite the entries for the keys in rows
>>> result = lookup[test]
>>> result
array([ 0, -1, -1,  2,  0])
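
The answer notes that the non-negativity requirement can be relaxed. One possible way to do that, sketched here as an assumption rather than as the answerer's own code, is to shift every value by the smallest value present before indexing the table. The function name lookup_map and its limit parameter are hypothetical, and the sketch assumes non-empty integer arrays. It also uses np.zeros instead of np.empty, which makes the explicit zero-fill step unnecessary.

```python
import numpy as np

def lookup_map(test, rows, data, limit=1 << 20):
    """Map each value in test to its entry in data, or 0 if it is not in rows.

    Hypothetical helper: assumes non-empty integer arrays and unique keys in rows.
    """
    lo = min(rows.min(), test.min())
    hi = max(rows.max(), test.max())
    size = int(hi) - int(lo) + 1
    if size > limit:
        raise ValueError("value range too large for a lookup table")
    # Zero-initialised table: values not listed in rows map to 0.
    lookup = np.zeros(size, dtype=data.dtype)
    # Shift by lo so that negative values become valid indices.
    lookup[rows - lo] = data
    return lookup[test - lo]

rows = np.array([-3, 4])
test = np.array([1, -3, -3, 4, 5])
data = np.array([-1, 2])
print(lookup_map(test, rows, data))
```

The table size is driven by the spread between the smallest and largest value, so the "not too large" restriction becomes a restriction on the range rather than on the magnitude.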
