Python: Selecting multiple slices from a numpy array at once

python numpy

I am looking for a way to select multiple slices from a numpy array at once. Say we have a 1D data array and want to extract three portions of it, like below:

data_extractions = []

for start_index in range(0, 3):
    data_extractions.append(data[start_index: start_index + 5])

After this, data_extractions will be:

data_extractions = [
    data[0:5],
    data[1:6],
    data[2:7]
]

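Is there any way to perform the above operation without the for loop? Some sort of indexing scheme in numpy that would let me select multiple slices from an array and return them as that many arrays, say in an n+1 dimensional array?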

I thought maybe I could replicate my data and then select a span from each row, but the code below throws an IndexError:

replicated_data = np.vstack([data] * 3)
data_extractions = replicated_data[[range(3)], [slice(0, 5), slice(1, 6), slice(2, 7)]]

stride_tricks can do that:

import numpy as np

a = np.arange(10)
b = np.lib.stride_tricks.as_strided(a, (3, 5), 2 * a.strides)
b
# array([[0, 1, 2, 3, 4],
#        [1, 2, 3, 4, 5],
#        [2, 3, 4, 5, 6]])

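Please note that b references the same memory as a, in fact multiple times (for example, b[0,1] and b[1,0] are the same memory address). It is therefore safest to make a copy before working with the new structure.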
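To make the aliasing warning concrete, here is a small sketch. It reuses the a and b from the answer; the names shares, aliased and c are mine, purely for illustration.

```python
import numpy as np

a = np.arange(10)
# Same call as in the answer: 3 overlapping windows of length 5.
b = np.lib.stride_tricks.as_strided(a, (3, 5), 2 * a.strides)

# The strided view shares its buffer with `a`.
shares = np.shares_memory(a, b)

# b[0, 1] and b[1, 0] both map to a[1], so one write shows up in both places.
b[0, 1] = -1
aliased = (a[1], b[1, 0])

# Copying first gives an independent array that is safe to modify.
c = b.copy()
c[0, 0] = 99  # leaves `a` and `b` untouched
```

After the copy, writes to c no longer leak into a or into the other windows.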

The nd case can be done in a similar fashion, for example 2d -> 4d:

a = np.arange(16).reshape(4, 4)
b = np.lib.stride_tricks.as_strided(a, (3,3,2,2), 2*a.strides)
b.reshape(9,2,2) # this forces a copy
# array([[[ 0,  1],
#         [ 4,  5]],

#        [[ 1,  2],
#         [ 5,  6]],

#        [[ 2,  3],
#         [ 6,  7]],

#        [[ 4,  5],
#         [ 8,  9]],

#        [[ 5,  6],
#         [ 9, 10]],

#        [[ 6,  7],
#         [10, 11]],

#        [[ 8,  9],
#         [12, 13]],

#        [[ 9, 10],
#         [13, 14]],

#        [[10, 11],
#         [14, 15]]])
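As a side note that is not part of the answer above: assuming NumPy 1.20 or newer, sliding_window_view from the same module should build the same 4d windows while computing the shape and strides for you, and it returns a read-only view by default. A sketch under that assumption:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

a = np.arange(16).reshape(4, 4)

# Manual version from the answer.
manual = as_strided(a, (3, 3, 2, 2), 2 * a.strides)

# Assumes NumPy >= 1.20: all 2x2 windows, shape (3, 3, 2, 2), read-only view.
windows = sliding_window_view(a, (2, 2))

same = np.array_equal(manual, windows)

# As in the answer, reshaping the overlapping view forces a copy.
patches = windows.reshape(9, 2, 2)
```

Because the view is read-only, accidental writes through overlapping windows raise an error instead of silently modifying a.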

You can use an index array to select the rows you want in the appropriate shape. For example:

 data = np.random.normal(size=(100,2,2,2))

 # Creating an array of row-indexes
 indexes = np.array([np.arange(0,5), np.arange(1,6), np.arange(2,7)])
 # data[indexes] will return an element of shape (3,5,2,2,2). Converting
 # to list happens along axis 0
 data_extractions = list(data[indexes])

 np.all(data_extractions[1] == data[1:6])
 True

The comparison at the end checks the result against the original data.
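One difference from the strided approach is worth spelling out: indexing with an integer array returns a copy, not a view. A minimal sketch, assuming a seeded generator only to make the run repeatable (stacked and is_copy are illustrative names):

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
data = rng.normal(size=(100, 2, 2, 2))

# One row of indices per slice, as in the answer.
indexes = np.array([np.arange(0, 5), np.arange(1, 6), np.arange(2, 7)])

# Integer-array indexing on axis 0 gives shape (3, 5, 2, 2, 2).
stacked = data[indexes]

# The result is a fresh array, so it does not alias `data`.
is_copy = not np.shares_memory(stacked, data)

# Writing to the result therefore leaves the original untouched.
stacked[0, 0] = 0.0
```

So this variant costs extra memory for the copied windows, but the windows can be modified freely.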

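This post describes an approach with a strided-indexing scheme using np.lib.stride_tricks.as_strided. It basically creates a view into the input array, so creating it is very efficient and, being a view, it occupies no additional memory.
It also works for ndarrays with a generic number of dimensions.

Here's the implementation: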
def strided_axis0(a, L):
    # Store the shape and strides info
    shp = a.shape
    s  = a.strides

    # Compute length of output array along the first axis
    nd0 = shp[0]-L+1

    # Setup shape and strides for use with np.lib.stride_tricks.as_strided
    # and get (n+1) dim output array
    shp_in = (nd0,L)+shp[1:]
    strd_in = (s[0],) + s
    return np.lib.stride_tricks.as_strided(a, shape=shp_in, strides=strd_in)

Sample run for a 4D array case:
In [44]: a = np.random.randint(11,99,(10,4,2,3)) # Array

In [45]: L = 5      # Window length along the first axis

In [46]: out = strided_axis0(a, L)

In [47]: np.allclose(a[0:L], out[0])  # Verify outputs
Out[47]: True

In [48]: np.allclose(a[1:L+1], out[1])
Out[48]: True

In [49]: np.allclose(a[2:L+2], out[2])
Out[49]: True
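For comparison, and assuming NumPy 1.20 or newer, the same windows along the first axis can probably be obtained with sliding_window_view plus an axis move. This is a sketch, not the answer's method; it repeats the manual as_strided call only to check that both agree.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

a = np.arange(10 * 4 * 2 * 3).reshape(10, 4, 2, 3)
L = 5  # window length along the first axis

# Manual version, same shape/strides logic as strided_axis0 in the answer.
s = a.strides
manual = as_strided(a, shape=(a.shape[0] - L + 1, L) + a.shape[1:],
                    strides=(s[0],) + s)

# sliding_window_view appends the window axis last: shape (6, 4, 2, 3, 5).
win = sliding_window_view(a, L, axis=0)

# Move the window axis next to the first axis: shape (6, 5, 4, 2, 3).
out = np.moveaxis(win, -1, 1)
```

Both results are views into a, so neither allocates memory for the windows.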

We can use a list comprehension for this:
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
data_extractions = [data[b:b + 5] for b in [1, 2, 3, 4, 5]]
data_extractions

Result:
[array([2, 3, 4, 5, 6]), array([3, 4, 5, 6, 7]), array([4, 5, 6, 7, 8]), array([5, 6, 7, 8, 9]), array([ 6,  7,  8,  9, 10])]
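If the goal is a single n+1 dimensional array rather than a list, the pieces from the comprehension can be joined with np.stack, provided they all have the same length. A short sketch (starts and stacked are illustrative names):

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
starts = [1, 2, 3, 4, 5]

# Each basic slice is a view into `data`; no values are copied yet.
data_extractions = [data[b:b + 5] for b in starts]

# np.stack copies the equal-length pieces into one new 2D array.
stacked = np.stack(data_extractions)
```

This still loops in Python over the start positions, so it mainly helps when the number of slices is small.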

You can index your array with a prepared index array:
a = np.array(list('abcdefg'))

b = np.array([
        [0, 1, 2, 3, 4],
        [1, 2, 3, 4, 5],
        [2, 3, 4, 5, 6]
    ])

a[b]

However, b doesn't have to be generated by hand this way. It can be built more dynamically:
b = np.arange(5) + np.arange(3)[:, None]
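The broadcasting trick also works when the start positions are not consecutive. Below is a small sketch; window_indices is a hypothetical helper name, not a numpy function.

```python
import numpy as np

def window_indices(starts, length):
    """Return a 2D index array with one row of `length` indices per start."""
    starts = np.asarray(starts)
    # Column of starts + row of offsets broadcasts to shape (len(starts), length).
    return starts[:, None] + np.arange(length)

a = np.array(list('abcdefg'))

# Reproduces the hand-written b from the answer.
b = window_indices([0, 1, 2], 5)
rows = a[b]

# Start positions may be in any order, as long as every index stays in range.
irregular = a[window_indices([0, 2, 1], 3)]
```

Unlike as_strided, out-of-range indices here raise an IndexError rather than reading arbitrary memory.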

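In the general case you have to do some sort of iteration, and concatenation, either when constructing the indexes or when collecting the results. Only when the slicing pattern is itself regular can you use a generalized slicing via as_strided.
The accepted answer constructs an indexing array, one row per slice. So that is iterating over the slices, and arange itself is a (fast) iteration. And np.array concatenates them on a new axis (np.stack generalizes this).
index_tricks has a convenient way of doing the same thing: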
In [265]: np.r_[0:5, 1:6, 2:7]
Out[265]: array([0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 2, 3, 4, 5, 6])

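It takes the slice notation, expands each slice with arange, and concatenates the results. It even lets me expand and concatenate into 2d: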
In [269]: np.r_['0,2',0:5, 1:6, 2:7]
Out[269]: 
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6]])

In [270]: data=np.array(list('abcdefghijk'))
In [272]: data[np.r_['0,2',0:5, 1:6, 2:7]]
Out[272]: 
array([['a', 'b', 'c', 'd', 'e'],
       ['b', 'c', 'd', 'e', 'f'],
       ['c', 'd', 'e', 'f', 'g']], 
      dtype='<U1')
In [273]: data[np.r_[0:5, 1:6, 2:7]]
Out[273]: 
array(['a', 'b', 'c', 'd', 'e', 'b', 'c', 'd', 'e', 'f', 'c', 'd', 'e',
       'f', 'g'], 
      dtype='<U1')

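My memory from other SO questions is that the relative timings are of the same order of magnitude. They may vary, for example, with the number of slices versus their length. Overall, the number of values that have to be copied from source to target will be the same.
If the slices vary in length, you'd have to use the flat indexing.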
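As an illustration of the flat-indexing case, here is a sketch for slices of different lengths. The (start, stop) pairs are made up for the example, and np.split is just one way to recover the individual pieces afterwards.

```python
import numpy as np

data = np.array(list('abcdefghijk'))

# Hypothetical (start, stop) pairs of different lengths.
slices = [(0, 2), (3, 7), (5, 6)]

# Build one flat index by expanding each slice with arange and concatenating.
flat_index = np.concatenate([np.arange(start, stop) for start, stop in slices])

# A single indexing operation gathers all values into one flat array.
flat_values = data[flat_index]

# Split the flat result back into per-slice pieces at the cumulative lengths.
lengths = [stop - start for start, stop in slices]
pieces = np.split(flat_values, np.cumsum(lengths)[:-1])
```

The result cannot be a regular 2d array because the rows would have different lengths, hence the list of pieces.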
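Comments:
@Divakar: Regarding dimensions, for simplicity I gave a 1D example, but I need a generic solution (my real problem is 4D).
I agree :) But not a native for loop :)
This doesn't avoid a for loop ;)
Nice, I didn't know about np.lib.stride_tricks.as_strided. Thank you, Paul.
@Puchatek Happy to help. Be careful with those things. As far as I know it doesn't check ranges, so it will happily let you access out-of-range memory, etc.
Yes, I played around with it in IPython and realized it will quickly ...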