Python: summing an array along a dimension with a different slice range each time


Suppose I have an array b of shape (3, 10, 3) and another array v = [8, 9, 4]; see below. For each of the 3 sub-arrays of shape (10, 3) in b, I need to sum a number of rows determined by v, i.e. for i = 0, 1, 2 I need np.sum(b[i, 0:v[i]], axis=0). My solution (shown below) uses a for loop, which I suspect is inefficient. I would like to know whether there is an efficient (vectorized) way to do what I described above.

Note: my actual arrays have more dimensions; these arrays are just for illustration.

import numpy as np

v = np.array([8, 9, 4])
b = np.array([[[0., 1., 0.],
               [0., 0., 1.],
               [0., 0., 1.],
               [0., 0., 1.],
               [1., 0., 0.],
               [1., 0., 0.],
               [0., 0., 1.],
               [1., 0., 0.],
               [0., 1., 0.],
               [1., 0., 0.]],
              [[0., 0., 1.],
               [0., 1., 0.],
               [1., 0., 0.],
               [0., 0., 1.],
               [1., 0., 0.],
               [1., 0., 0.],
               [1., 0., 0.],
               [0., 1., 0.],
               [0., 0., 1.],
               [0., 1., 0.]],
              [[1., 0., 0.],
               [1., 0., 0.],
               [1., 0., 0.],
               [0., 0., 1.],
               [0., 1., 0.],
               [0., 1., 0.],
               [1., 0., 0.],
               [1., 0., 0.],
               [0., 0., 1.],
               [1., 0., 0.]]])
n = v.shape[0]
vv = np.zeros([n, n])
for i in range(n):
    vv[i] = np.sum(b[i, 0:v[i]], axis=0)
Output:

vv
array([[3., 1., 4.],
       [4., 2., 3.],
       [3., 0., 1.]])
EDIT: Below is a realistic example of the arrays v and b.

v = np.random.randint(0, 300, size=(32, 98, 3))
b = np.zeros([98, 3, 300, 3])
for i in range(3):
    for j in range(98):
        b[j, i] = np.random.multinomial(1, [1./3, 1./3, 1./3], 300)

v.shape
Out[292]: (32, 98, 3)
b.shape
Out[293]: (98, 3, 300, 3)

I need to do the same thing as before, so that the final result is an array of shape (32, 98, 3, 3). Note that I have to do this at every iteration, which is why I am looking for an efficient implementation.
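To make the required indexing explicit, here is a minimal (and deliberately slow) reference loop for this larger example; it assumes, as in the answers below, that v indexes into b.shape[2] and contributes one additional leading dimension to the result:

# Reference loop (slow, for illustration only): v indexes into b.shape[2],
# and its leading dimension of size 32 becomes the leading dimension of the result.
result = np.zeros((32, 98, 3, 3))
for l in range(32):
    for i in range(98):
        for j in range(3):
            result[l, i, j] = np.sum(b[i, j, 0:v[l, i, j]], axis=0)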

The following function allows reducing a given axis with varying slices indicated by start and stop index arrays. Under the hood it works on appropriately reshaped and re-indexed versions of the input array. It avoids unnecessary computations, but it allocates an intermediate array twice the size of the final output array (the computations for the discarded values are no-ops, though).

For the examples in the OP it can be used as follows:

# 1. example:
b = np.random.randint(0, 1000, size=(3, 10, 3))
v = np.random.randint(-9, 10, size=3)  # Indexing into `b.shape[1]`.
result = sliced_reduce(b, np.zeros_like(v), v, np.add)

# 2. example:
b = np.random.randint(0, 1000, size=(98, 3, 300, 3))
v = np.random.randint(-299, 300, size=(32, 98, 3))  # Indexing into `b.shape[2]`; one additional leading dimension for `v`.
result = sliced_reduce(b, np.zeros_like(v), v, np.add, axis=2)

# 1. example:
b = np.random.randint(0, 1000, size=(3, 10, 3))
v = np.random.randint(-9, 10, size=3)  # Indexing into `b.shape[1]`.
result = sliced_sum(b, np.zeros_like(v), v)

# 2. example:
b = np.random.randint(0, 1000, size=(98, 3, 300, 3))
v = np.random.randint(-299, 300, size=(32, 98, 3))  # Indexing into `b.shape[2]`; one additional leading dimension for `v`.
result = sliced_sum(b, np.zeros_like(v), v, axis=2)

# 1. example:
b = np.random.randint(0, 1000, size=(3, 10, 3))
v = np.random.randint(-9, 10, size=3)  # Indexing into `b.shape[1]`.
result = sliced_sum_numba(b, np.zeros_like(v), v)

# 2. example:
b = np.random.randint(0, 1000, size=(98, 3, 300, 3))
v = np.random.randint(-299, 300, size=(32, 98, 3))  # Indexing into `b.shape[2]`; one additional leading dimension for `v`.
result = sliced_sum_numba(b, np.zeros_like(v), v, axis=2)
Notes
  • Reversing the order of the flat index pairs so that even < odd, thereby short-circuiting every second computation as a no-op, does not seem to be a good idea (probably because the flat array is then no longer traversed in memory-layout order). Removing this part and using the flat indices in ascending order gives a performance boost of about 30% (this applies to the other versions as well, but is not included there).

For what it's worth, here is a one-liner. No promises that it is the most efficient version, since it does many more additions than needed:

In [25]: b.cumsum(axis=1)[np.arange(b.shape[0]), v-1]                                                          
Out[25]: 
array([[3., 1., 4.],
       [4., 2., 3.],
       [3., 0., 1.]])

(Also note that it does not handle zeros in v correctly.)
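One possible way to also handle zeros in v (a sketch of my own, not from the original answer) is to prepend a row of zeros before indexing into the cumulative sum, so that v == 0 selects an all-zero partial sum:

# Prepend a zero "row" along the summation axis, then index with v directly
# (v == 0 now picks the all-zero prefix sum instead of wrapping around to the last row).
c = np.concatenate([np.zeros((b.shape[0], 1, b.shape[2])), b.cumsum(axis=1)], axis=1)
vv = c[np.arange(b.shape[0]), v]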

The following function allows summing over a given axis with varying slices indicated by start and stop index arrays. It works with an appropriately computed coefficient array that indicates which elements of the input array should participate in the sum (with coefficients of 1 and 0). Relying on einsum also makes the implementation compatible with other packages (with only minor changes). Since it performs an additional multiplication with the coefficient array for every addition, it doubles the number of necessary computations.

from string import ascii_lowercase as symbols
import numpy as np

def sliced_sum(a, i, j, axis=None):
    """Sum an array along a given axis for varying slices `a[..., i:j, ...]` where `i` and `j` are arrays themselves.

    Parameters
    ----------
    a : array
        The array to be summed over.
    i : array
        The start indices for the summation axis. Must have the same shape as `j`.
    j : array
        The stop indices for the summation axis. Must have the same shape as `i`.
    axis : int, optional
        Axis to be summed over. Defaults to `len(i.shape)`.

    Returns
    -------
    array
        Shape `i.shape + a.shape[axis+1:]`.

    Notes
    -----
    The shapes of `a` and `i`, `j` must match up to the summation axis.
    That means `a.shape[:axis] == i.shape[len(i.shape) - axis:]`.
    `i` and `j` can have additional leading dimensions and `a` can have additional trailing dimensions.
    """
    if axis is None:
        axis = len(i.shape)

    # Compute number of leading, common and trailing dimensions.
    l = len(i.shape) - axis      # Number of leading dimensions.
    m = len(i.shape) - l         # Number of common dimensions.
    n = len(a.shape) - axis - 1  # Number of trailing dimensions.

    # Select the corresponding symbols for `np.einsum`.
    leading = symbols[:l]
    common = symbols[l:l+m]
    summation = symbols[l+m]
    trailing = symbols[l+m+1:l+m+1+n]

    # Convert negative indices.
    i = (a.shape[axis] + i) % a.shape[axis]
    j = (a.shape[axis] + j) % a.shape[axis]

    # Compute the "active" elements, i.e. the ones that should participate in the summation.
    # "active" elements have a coefficient of 1 (True), others are 0 (False).
    indices, i, j = np.broadcast_arrays(np.arange(a.shape[axis]),
                                        np.expand_dims(i, -1), np.expand_dims(j, -1))
    active_elements = (i <= indices) & (indices < j)
    return np.einsum(f'{leading + common + summation},{common + summation + trailing}->{leading + common + trailing}',
                     active_elements, a)
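As a quick sanity check (a sketch using the small b and v from the question), the function reproduces the loop result above:

vv = sliced_sum(b, np.zeros_like(v), v)  # start indices all zero, stop indices taken from v
# vv
# array([[3., 1., 4.],
#        [4., 2., 3.],
#        [3., 0., 1.]])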

Another option is to speed up the loop using a cumulative reduction. This avoids unnecessary computations and memory allocation, and works with NumPy ufuncs via ufunc.accumulate:
def reduce_cumulative(a, i, j, ufunc, axis=None):
    if axis is None:
        axis = len(i.shape)
    i = (a.shape[axis] + i) % a.shape[axis]
    j = (a.shape[axis] + j) % a.shape[axis]
    a = np.insert(a, 0, 0, axis)  # Insert zeros to account for zero indices.
    c = ufunc.accumulate(a, axis=axis)
    pre = np.ix_(*(range(x) for x in i.shape))  # Indices for dimensions prior to `axis`.
    l = len(i.shape) - axis  # Number of leading dimensions in `i` and `j`.
    return c[pre[l:] + (j,)] - c[pre[l:] + (i,)]
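For instance, applied to the small example from the question (a sketch assuming np.add as the reduction), this reproduces the loop result:

vv = reduce_cumulative(b, np.zeros_like(v), v, np.add)
# vv
# array([[3., 1., 4.],
#        [4., 2., 3.],
#        [3., 0., 1.]])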