Python: sum an array along different dims using a different slice range each time

Suppose I have an array b of shape (3, 10, 3) and another array v = [8, 9, 4] of shape (3,), see below. For each of the 3 (10, 3) arrays in b, I need to sum over a number of rows determined by v, i.e. for i = 0, 1, 2 I need to get np.sum(b[i, 0:v[i]], axis=0). My solution (shown below) uses a for loop, which I suppose is inefficient. I would like to know whether there is an efficient (vectorized) way to do what I described above.

Note: my actual arrays have more dimensions; these arrays are just for illustration.
v = np.array([8, 9, 4])
b = np.array([[[0., 1., 0.],
[0., 0., 1.],
[0., 0., 1.],
[0., 0., 1.],
[1., 0., 0.],
[1., 0., 0.],
[0., 0., 1.],
[1., 0., 0.],
[0., 1., 0.],
[1., 0., 0.]],
[[0., 0., 1.],
[0., 1., 0.],
[1., 0., 0.],
[0., 0., 1.],
[1., 0., 0.],
[1., 0., 0.],
[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.],
[0., 1., 0.]],
[[1., 0., 0.],
[1., 0., 0.],
[1., 0., 0.],
[0., 0., 1.],
[0., 1., 0.],
[0., 1., 0.],
[1., 0., 0.],
[1., 0., 0.],
[0., 0., 1.],
[1., 0., 0.]]])
n = v.shape[0]
vv = np.zeros([n, n])
for i in range(n):
    vv[i] = np.sum(b[i, 0:v[i]], axis=0)
Output:

vv
array([[3., 1., 4.],
[4., 2., 3.],
[3., 0., 1.]])
Edit:

Here is a realistic example of the arrays v and b:
v = np.random.randint(0, 300, size=(32, 98, 3))
b = np.zeros([98, 3, 300, 3])
for i in range(3):
    for j in range(98):
        b[j, i] = np.random.multinomial(1, [1./3, 1./3, 1./3], 300)
v.shape
Out[292]: (32, 98, 3)
b.shape
Out[293]: (98, 3, 300, 3)
I need to do the same as before, so that the final result is an array of shape (32, 98, 3, 3). Note that I have to perform this operation at every iteration, which is why I am looking for an efficient implementation.

The following function allows reducing a given axis with varying slices indicated by start and stop arrays. It uses np.ufunc.reduceat under the hood, together with appropriately reshaped and flat-indexed versions of the input array. It avoids unnecessary computations, but it allocates an intermediate array twice the size of the final output array (the computations for the discarded values are no-ops, though).
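The sliced_reduce implementation itself did not survive extraction here. Below is a minimal sketch of the reduceat idea described above, under simplifying assumptions (i and j have the same shape as a.shape[:axis], 0 <= i <= j <= a.shape[axis], and the ufunc has an identity element); the original version additionally handled negative indices and extra leading dimensions:

```python
import numpy as np

def sliced_reduce(a, i, j, ufunc, axis=None):
    """Reduce varying slices `a[..., i:j, ...]` along `axis` with `ufunc`.

    Minimal sketch: assumes `i.shape == a.shape[:axis]` (no extra leading
    dimensions), `0 <= i <= j <= a.shape[axis]`, and a ufunc with an
    identity element (e.g. `np.add`).
    """
    if axis is None:
        axis = len(i.shape)
    n = a.shape[axis]
    lead = int(np.prod(a.shape[:axis], dtype=int))
    trail = int(np.prod(a.shape[axis + 1:], dtype=int))
    merged = a.reshape(lead * n, trail)  # One contiguous block per slice.
    # Flat [start, stop) index pairs into the length-`lead * n` axis.
    offsets = np.arange(lead) * n
    starts = offsets + i.ravel()
    stops = offsets + j.ravel()
    indices = np.stack([starts, stops], axis=1).ravel()
    # `reduceat` computes one reduction per consecutive index pair; the
    # (start, stop) pairs are the wanted slices, while the (stop, next
    # start) gaps produce the unwanted results that `[::2]` discards.
    if indices[-1] == lead * n:  # reduceat indices must be < len(merged);
        indices = indices[:-1]   # the final segment then runs to the end.
    reduced = ufunc.reduceat(merged, indices, axis=0)[::2]
    # For empty slices (start == stop) reduceat returns `a[start]`, not
    # the identity, so patch those entries manually.
    reduced[starts == stops] = ufunc.identity
    return reduced.reshape(i.shape + a.shape[axis + 1:])

b = np.arange(90, dtype=float).reshape(3, 10, 3)
v = np.array([8, 9, 4])
result = sliced_reduce(b, np.zeros_like(v), v, np.add)
```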
For the examples in the OP, it can be used as follows:
# 1. example:
b = np.random.randint(0, 1000, size=(3, 10, 3))
v = np.random.randint(-9, 10, size=3) # Indexing into `b.shape[1]`.
result = sliced_reduce(b, np.zeros_like(v), v, np.add)
# 2. example:
b = np.random.randint(0, 1000, size=(98, 3, 300, 3))
v = np.random.randint(-299, 300, size=(32, 98, 3)) # Indexing into `b.shape[2]`; one additional leading dimension for `v`.
result = sliced_reduce(b, np.zeros_like(v), v, np.add, axis=2)
# 1. example:
b = np.random.randint(0, 1000, size=(3, 10, 3))
v = np.random.randint(-9, 10, size=3) # Indexing into `b.shape[1]`.
result = sliced_sum(b, np.zeros_like(v), v)
# 2. example:
b = np.random.randint(0, 1000, size=(98, 3, 300, 3))
v = np.random.randint(-299, 300, size=(32, 98, 3)) # Indexing into `b.shape[2]`; one additional leading dimension for `v`.
result = sliced_sum(b, np.zeros_like(v), v, axis=2)
# 1. example:
b = np.random.randint(0, 1000, size=(3, 10, 3))
v = np.random.randint(-9, 10, size=3) # Indexing into `b.shape[1]`.
result = sliced_sum_numba(b, np.zeros_like(v), v)
# 2. example:
b = np.random.randint(0, 1000, size=(98, 3, 300, 3))
v = np.random.randint(-299, 300, size=(32, 98, 3)) # Indexing into `b.shape[2]`; one additional leading dimension for `v`.
result = sliced_sum_numba(b, np.zeros_like(v), v, axis=2)
Notes:

- Reversing the order of the flat index pairs in order to obtain even < odd, and thereby turning the computation of every second (unwanted) slice into a no-op, doesn't seem to be a good idea (probably because the flattened array is then no longer traversed in memory-layout order). Removing this part and using the flat indices in ascending order improves performance by about 30%.
For what it's worth, here is a one-liner. Nobody promised it's the most efficient version, since it performs far more additions than needed:

In [25]: b.cumsum(axis=1)[np.arange(b.shape[0]), v-1]
Out[25]:
array([[3., 1., 4.],
       [4., 2., 3.],
       [3., 0., 1.]])

(Note also that it doesn't handle zeros in v correctly.)
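One way to make the cumsum one-liner robust to zeros in v (a sketch, not from the original answer) is to prepend a row of zeros along the summation axis, so that index v, rather than v - 1, selects the sum of the first v rows:

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.integers(0, 10, size=(3, 10, 3)).astype(float)
v = np.array([8, 0, 4])  # note the zero

# Prepend a zero row along axis 1 so that c[k, m] is the sum of the
# first m rows of b[k]; v == 0 then correctly yields an all-zero row.
c = np.concatenate([np.zeros_like(b[:, :1]), b.cumsum(axis=1)], axis=1)
result = c[np.arange(b.shape[0]), v]
```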
The following function allows summing over a given axis with varying slices indicated by start and stop arrays. It works with an appropriately computed array of coefficients that indicate which elements of the input array should participate in the sum (using coefficients 1 and 0). Relying on einsum keeps the implementation compatible, with only minor changes, with other array packages. Since it performs an extra multiplication with the coefficient array for every addition, it doubles the number of necessary computations:
from string import ascii_lowercase as symbols

import numpy as np


def sliced_sum(a, i, j, axis=None):
    """Sum an array along a given axis for varying slices `a[..., i:j, ...]` where `i` and `j` are arrays themselves.

    Parameters
    ----------
    a : array
        The array to be summed over.
    i : array
        The start indices for the summation axis. Must have the same shape as `j`.
    j : array
        The stop indices for the summation axis. Must have the same shape as `i`.
    axis : int, optional
        Axis to be summed over. Defaults to `len(i.shape)`.

    Returns
    -------
    array
        Shape `i.shape + a.shape[axis+1:]`.

    Notes
    -----
    The shapes of `a` and `i`, `j` must match up to the summation axis.
    That means `a.shape[:axis] == i.shape[len(i.shape) - axis:]`.
    `i` and `j` can have additional leading dimensions and `a` can have additional trailing dimensions.
    """
    if axis is None:
        axis = len(i.shape)
    # Compute number of leading, common and trailing dimensions.
    l = len(i.shape) - axis  # Number of leading dimensions.
    m = len(i.shape) - l  # Number of common dimensions.
    n = len(a.shape) - axis - 1  # Number of trailing dimensions.
    # Select the corresponding symbols for `np.einsum`.
    leading = symbols[:l]
    common = symbols[l:l+m]
    summation = symbols[l+m]
    trailing = symbols[l+m+1:l+m+1+n]
    # Convert negative indices.
    i = (a.shape[axis] + i) % a.shape[axis]
    j = (a.shape[axis] + j) % a.shape[axis]
    # Compute the "active" elements, i.e. the ones that should participate in the summation.
    # "Active" elements have a coefficient of 1 (True), others 0 (False).
    indices, i, j = np.broadcast_arrays(np.arange(a.shape[axis]),
                                        np.expand_dims(i, -1), np.expand_dims(j, -1))
    active_elements = (i <= indices) & (indices < j)
    return np.einsum(f'{leading + common + summation},{common + summation + trailing}->{leading + common + trailing}',
                     active_elements, a)
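As a self-contained illustration of the coefficient idea (not part of the original answer), here is the question's 3D case written out with a plain einsum call, which the general function above reduces to:

```python
import numpy as np

rng = np.random.default_rng(1)
b = rng.random((3, 10, 3))
v = np.array([8, 9, 4])

# Coefficient matrix: mask[k, r] is True iff row r participates in sum k.
mask = np.arange(b.shape[1]) < v[:, None]  # shape (3, 10)
# Multiply-and-sum via einsum: result[k, t] = sum_r mask[k, r] * b[k, r, t].
result = np.einsum('kr,krt->kt', mask, b)
```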
Another option is to use a cumulative reduction of the array and take differences of the accumulated values. This avoids unnecessary computations and memory allocation and is fully compatible with all NumPy ufuncs:
def reduce_cumulative(a, i, j, ufunc, axis=None):
    if axis is None:
        axis = len(i.shape)
    i = (a.shape[axis] + i) % a.shape[axis]
    j = (a.shape[axis] + j) % a.shape[axis]
    a = np.insert(a, 0, 0, axis)  # Insert zeros to account for zero indices.
    c = ufunc.accumulate(a, axis=axis)
    pre = np.ix_(*(range(x) for x in i.shape))  # Indices for dimensions prior to `axis`.
    l = len(i.shape) - axis  # Number of leading dimensions in `i` and `j`.
    return c[pre[l:] + (j,)] - c[pre[l:] + (i,)]
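The cumulative-difference idea can be checked in isolation for the question's 3D case (a self-contained sketch, specialized to np.add on axis 1):

```python
import numpy as np

rng = np.random.default_rng(2)
b = rng.random((3, 10, 3))
v = np.array([8, 0, 4])  # includes a zero-length slice

# Prepend a zero row so that c[k, m] holds the sum of the first m rows
# of b[k], then take differences of the accumulated array. With start
# indices all zero the subtrahend is a zero row, but it mirrors the
# general c[..., j] - c[..., i] form of reduce_cumulative.
c = np.add.accumulate(np.insert(b, 0, 0, axis=1), axis=1)
rows = np.arange(b.shape[0])
result = c[rows, v] - c[rows, np.zeros_like(v)]
```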