Python: combining two slicing operations


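Is there a smart and simple way to combine two slicing operations into one?

Say I have something like

>>> arange(1000)[::2][10:20]
array([20, 22, 24, 26, 28, 30, 32, 34, 36, 38])

Of course this is not a problem in this example, but if the arrays are very large I would very much like to avoid creating the intermediate array (or is there none?). I believe it should be possible to combine the two slices, but maybe I'm overlooking something. So the idea would be something like:

arange(1000)[ slice(None,None,2) + slice(10,20,None) ]

This of course does not work, but it is what I would like to do. Is there anything that combines slice objects? (Despite my efforts I did not find anything.)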

Very simple:

arange(1000)[20:40:2]

should do it.
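
To see where the merged bounds come from: element k of x[::2] is x[2 * k], so taking elements 10 to 19 of that view means taking x[20], x[22], ..., x[38]. The sketch below checks this for the example from the question. It assumes the first slice starts at index 0 and that the bounds of the second slice are non-negative and inside the array; the variable names are only illustrative.

```python
import numpy as np

x = np.arange(1000)

step = 2              # step of the first slice, x[::step]
start, stop = 10, 20  # bounds of the second slice, [start:stop]

chained = x[::step][start:stop]

# Scale the second slice's bounds by the first slice's step.
merged = x[start * step : stop * step : step]

print(merged)  # [20 22 24 26 28 30 32 34 36 38]
```

Negative indices, negative steps, or an offset in the first slice need more care than this.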

You can use islice, which may not be faster but avoids the intermediate items by acting as a generator:

arange = range(1000)

from itertools import islice
islice(islice(arange, None, None, 2), 10, 20)

%timeit list(islice(islice(arange, None, None, 2), 10, 20))
100000 loops, best of 3: 2 us per loop

%timeit arange[::2][10:20]
100000 loops, best of 3: 2.64 us per loop
So it is slightly faster here (2 us versus 2.64 us per loop).

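  • You could subclass slice to make this kind of slice superposition possible by overriding the appropriate special method, though that involves some arithmetic. By the way, you could build a nice Python package out of this ;-)

  • Also, slicing in NumPy costs almost nothing, because basic slicing returns a view rather than a copy. So you can stay with a simple solution such as a list of slices.
  • P.S. In general, using several slices can make the code prettier and clearer. Even the simple choice between the following lines: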

    v = A[::2][10:20]
    v = A[20:40][::2]
    v = A[20:40:2]
    
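    can reflect the program logic and make the code self-documenting.

    One more example: if you have a flat NumPy array and want to extract a subarray of length length starting at position position, you can write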

    v = A[position : position + length]
    


    Decide for yourself which option looks better. ;-)
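
The "list of slices" idea from the bullets above can be sketched as a small wrapper that supports +. Note that in CPython the built-in slice type cannot be used as a base class, so this sketch wraps slices instead of subclassing. The class name SliceChain and its apply method are hypothetical, not part of Python or NumPy.

```python
import numpy as np


class SliceChain:
    """Hypothetical helper: an ordered chain of slices that can be added together."""

    def __init__(self, *slices):
        self.slices = tuple(slices)

    def __add__(self, other):
        # Concatenate the two chains; nothing is computed until apply().
        return SliceChain(*self.slices, *other.slices)

    def apply(self, seq):
        # Apply each slice in turn. On a NumPy array every step is a view.
        for s in self.slices:
            seq = seq[s]
        return seq


x = np.arange(1000)
chain = SliceChain(slice(None, None, 2)) + SliceChain(slice(10, 20))
result = chain.apply(x)

print(result)  # [20 22 24 26 28 30 32 34 36 38]
```

This does not merge the slices into one, but the chain can be stored and reused, and for NumPy arrays applying it copies no data.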

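As @Tigran said, slicing is cheap when you work with NumPy arrays. In general, though, we can chain two slices into one using the information from slice.indices:

    Retrieve the start, stop, and step indices from the slice object slice assuming a sequence of length length.

We can reduce

    x[slice1][slice2]

to

    x[combined]

The first slicing returns a new object, which is then sliced by the second slice. So we also need the length of the data object (its length in the first dimension) to combine the slices correctly.

So we can write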

    def slice_combine(slice1, slice2, length):
        """
        returns a slice that is a combination of the two slices.
        As in 
          x[slice1][slice2]
        becomes
          combined_slice = slice_combine(slice1, slice2, len(x))
          x[combined_slice]
    
        :param slice1: The first slice
        :param slice2: The second slice
        :param length: The length of the first dimension of data being sliced. (eg len(x))
        """
    
        # First get the step sizes of the two slices.
        slice1_step = (slice1.step if slice1.step is not None else 1)
        slice2_step = (slice2.step if slice2.step is not None else 1)
    
        # The final step size
        step = slice1_step * slice2_step
    
        # Use slice1.indices to get the actual indices returned from slicing with slice1
        slice1_indices = slice1.indices(length)
    
        # We calculate the length of the first slice
        slice1_length = (abs(slice1_indices[1] - slice1_indices[0]) - 1) // abs(slice1_indices[2])
    
        # If we step in the same direction as the start,stop, we get at least one datapoint
        if (slice1_indices[1] - slice1_indices[0]) * slice1_step > 0:
            slice1_length += 1
        else:
            # Otherwise, The slice is zero length.
            return slice(0,0,step)
    
        # Use the length after the first slice to get the indices returned from a
        # second slice starting at 0.
        slice2_indices = slice2.indices(slice1_length)
    
        # if the final range length = 0, return
        if not (slice2_indices[1] - slice2_indices[0]) * slice2_step > 0:
            return slice(0,0,step)
    
        # We shift slice2_indices by the starting index in slice1 and the 
        # step size of slice1
        start = slice1_indices[0] + slice2_indices[0] * slice1_step
        stop = slice1_indices[0] + slice2_indices[1] * slice1_step
    
        # slice.indices will return -1 as the stop index when slice.stop should be set to None.
        if start > stop:
            if stop < 0:
                stop = None
    
        return slice(start, stop, step)
    
Testing it with the script listed at the end of this page, we thankfully get

    All 15,059,072 tests passed!
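
A shorter route to the same result is sketched below. It is not from the answer above: it assumes Python 3, where range objects can be sliced lazily and expose start, stop, and step. The function name combine_via_range is hypothetical.

```python
def combine_via_range(slice1, slice2, length):
    """Return one slice equivalent to x[slice1][slice2] for len(x) == length."""
    # Slicing a range yields another range, so no index list is materialised.
    r = range(length)[slice1][slice2]

    if len(r) == 0:
        return slice(0, 0, 1)

    # A negative stop on a non-empty range only occurs when stepping down
    # past index 0; as a slice bound that has to be written as None.
    stop = r.stop if r.stop >= 0 else None
    return slice(r.start, stop, r.step)


combined = combine_via_range(slice(None, None, 2), slice(10, 20), 1000)
print(combined)  # slice(20, 40, 2)

reversed_tail = combine_via_range(slice(None, None, -1), slice(5, None), 10)
print(reversed_tail)  # slice(4, None, -1)
```

As with slice_combine, the length of the first dimension is required, because negative and open-ended bounds cannot be resolved without it.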


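I think the OP is looking for a more general approach that does not require updating the values by hand.

Thanks a lot for pointing this out, but I am indeed looking for a general solution. That is my fault for not stating it clearly.

You have to keep in mind that when you slice NumPy arrays you are not copying any data, only changing the view of the data in memory. Slicing is O(1), so creating the intermediate array is not a big deal.

That is a good point, so I guess that is why there is no built-in functionality like this. However, I would like to save the slices that have to be applied to a matrix in order to get the data I want into an object. Of course one could save all the slices in a list, but it feels like there should be a smart way to combine them first and then save the result.

If you want to do that, I would suggest storing the slices as boolean arrays; you can then combine them with logical operators and use the result as your "slice". Unfortunately this is not very flexible if you want to handle arrays of several sizes. If you implement it as a function, you can pass the slices as *args and let the function do all the calculations needed to create the right boolean array.

Again, this only makes sense for non-NumPy sequences.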
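
The boolean-array suggestion from the comments could look roughly like the sketch below. The function name mask_from_slices is hypothetical, and the approach assumes the array length is known when the mask is built.

```python
import numpy as np


def mask_from_slices(length, *slices):
    """Boolean mask selecting the elements that x[s1][s2]... would pick."""
    # Slice an index array instead of the data to find the surviving positions.
    idx = np.arange(length)
    for s in slices:
        idx = idx[s]

    mask = np.zeros(length, dtype=bool)
    mask[idx] = True
    return mask


x = np.arange(1000)
mask = mask_from_slices(len(x), slice(None, None, 2), slice(10, 20))

print(x[mask])  # [20 22 24 26 28 30 32 34 36 38]
```

Two caveats: indexing with a boolean mask returns a copy rather than a view, and the selected elements always come back in ascending index order, so slices with a negative step lose their ordering.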
    
The test script for slice_combine, which compares it with chained slicing on a 1D and a 2D array:

    import sys
    import numpy as np
    
    # Make a 1D dataset
    x = np.arange(100)
    l = len(x)
    
    # Make a (100, 10) dataset
    x2 = np.arange(1000)
    x2 = x2.reshape((100,10))
    l2 = len(x2)
    
    # Test indices and steps
    indices = [None, -1000, -100, -99, -50, -10, -1, 0, 1, 10, 50, 99, 100, 1000]
    steps = [-1000, -99, -50, -10, -3, -2, -1, 1, 2, 3, 10, 50, 99, 1000]
    indices_l = len(indices)
    steps_l = len(steps)
    
    count = 0
    total = 2 * indices_l**4 * steps_l**2
    for i in range(indices_l):
        for j in range(indices_l):
            for k in range(steps_l):
                for q in range(indices_l):
                    for r in range(indices_l):
                        for s in range(steps_l):
                            # Print the progress. There are a lot of combinations.
                            if count % 5197 == 0:
                                sys.stdout.write("\rPROGRESS: {0:,}/{1:,} ({2:.0f}%)".format(count, total, float(count) / float(total) * 100))
                                sys.stdout.flush()
    
                            slice1 = slice(indices[i], indices[j], steps[k])
                            slice2 = slice(indices[q], indices[r], steps[s])
    
                            combined = slice_combine(slice1, slice2, l)
                            combined2 = slice_combine(slice1, slice2, l2)
                            np.testing.assert_array_equal(x[slice1][slice2], x[combined], 
                                err_msg="For 1D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(slice1, slice2, combined, count))
                            np.testing.assert_array_equal(x2[slice1][slice2], x2[combined2], 
                                err_msg="For 2D, slice1: {0},\tslice2: {1},\tcombined: {2}\tCOUNT: {3}".format(slice1, slice2, combined2, count))
    
                            # 2 tests per loop
                            count += 2
    
    print("\n-----------------")
    print("All {0:,} tests passed!".format(count))