Python: Deleting a run of elements every n-th position in a NumPy array


I know how to delete every 4th element of a NumPy array:

frame = np.delete(frame,np.arange(4,frame.size,4))
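Now I would like to know whether there is a simple command that deletes a run of 3 values every n elements (for example, n = 4).

A basic example:

Input: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20….]

would result in:

Output: [1,2,3,7,8,9,13,14,15,19,20,…]

I am hoping for a simple numpy/Python feature rather than writing a function that has to iterate over the vector (which in my case is quite long, …).

Thanks for your help.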

Approach #1: Here is one approach using boolean indexing -

a[np.mod(np.arange(a.size),6)<3]
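
To make the one-liner above easier to follow, here is a step-by-step sketch of the same idea. The names keep, drop, period, pos and mask are only illustrative and not part of the original answer. The 6 is the length of one repeating block (3 kept values plus 3 deleted values), and the comparison with 3 keeps the first three positions of each block.

```python
import numpy as np

a = np.arange(1, 21)

keep, drop = 3, 3            # illustrative names: keep 3 values, then drop 3
period = keep + drop         # length of one repeating block (the "6")

idx = np.arange(a.size)      # positions 0, 1, 2, ..., a.size - 1
pos = np.mod(idx, period)    # position inside the current block: 0..5, 0..5, ...
mask = pos < keep            # True for the first `keep` positions of each block

result = a[mask]             # boolean indexing keeps only the True positions
print(result)
```

Changing keep and drop should give other patterns without touching the rest.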
Sample run -

In [545]: a = np.arange(1,21)

In [546]: a
Out[546]: 
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20])

In [547]: select_in_groups_strided(a,3,3)
Out[547]: array([ 1,  2,  3,  7,  8,  9, 13, 14, 15, 19, 20])

In [548]: a = np.arange(1,25)

In [549]: a
Out[549]: 
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24])

In [550]: select_in_groups_strided(a,3,3)
Out[550]: array([ 1,  2,  3,  7,  8,  9, 13, 14, 15, 19, 20, 21])
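
The sample run calls select_in_groups_strided, whose definition does not appear in this text. The sketch below is not that function. It is a hypothetical stand-in (select_in_groups_reshape is a made-up name) that assumes the same argument order, keep count first and drop count second, and uses a plain reshape instead of strides. It may be useful for reproducing the outputs shown above.

```python
import numpy as np

def select_in_groups_reshape(a, keep, drop):
    """Hypothetical stand-in: keep `keep` values, skip `drop`, and repeat."""
    a = np.asarray(a)
    period = keep + drop
    n_full = a.size // period                    # number of complete blocks
    # View the complete blocks as rows and keep the first `keep` columns.
    head = a[:n_full * period].reshape(n_full, period)[:, :keep].ravel()
    # The leftover partial block contributes at most `keep` values.
    tail = a[n_full * period:][:keep]
    return np.concatenate([head, tail])

print(select_in_groups_reshape(np.arange(1, 21), 3, 3))
print(select_in_groups_reshape(np.arange(1, 25), 3, 3))
```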
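Runtime test

Using the same setup as the block_delete timing tests -

If you care about performance, the strided approach scales well across different sizes.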

An approach using boolean indexing:

def block_delete(a, n, m):  #keep n, remove m
    mask = np.tile(np.r_[np.ones(n), np.zeros(m)].astype(bool), a.size // (n + m) + 1)[:a.size]
    return a[mask]
Compared with @Divakar's approach:

def mod_delete(a, n, m):
    return a[np.mod(np.arange(a.size), n + m) < n]

a = np.arange(19) + 1

%timeit block_delete(a, 3, 4)
10000 loops, best of 3: 50.6 µs per loop

%timeit mod_delete(a, 3, 4)
The slowest run took 9.37 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.69 µs per loop
With a longer array:

a = np.arange(999999) + 1

%timeit block_delete(a, 3, 4)
100 loops, best of 3: 3.93 ms per loop

%timeit mod_delete(a, 3, 4)
100 loops, best of 3: 12.3 ms per loop

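So which one is faster depends on the size of the array: mod_delete wins on the small array, block_delete on the large one.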
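
Since the numbers above come from one machine and one NumPy version, it may be worth re-running the comparison locally. The following is a minimal sketch using the standard timeit module instead of IPython's %timeit. The helper compare and its parameters are assumptions made for illustration. It also checks that both functions return the same result before timing them.

```python
import timeit
import numpy as np

def block_delete(a, n, m):  # keep n, remove m (tiled boolean mask)
    pattern = np.r_[np.ones(n), np.zeros(m)].astype(bool)
    mask = np.tile(pattern, a.size // (n + m) + 1)[:a.size]
    return a[mask]

def mod_delete(a, n, m):  # keep n, remove m (modulo-based mask)
    return a[np.mod(np.arange(a.size), n + m) < n]

def compare(size, n=3, m=4, number=20):
    """Illustrative helper: average seconds per call for both functions."""
    a = np.arange(size) + 1
    # Both approaches should select exactly the same elements.
    assert np.array_equal(block_delete(a, n, m), mod_delete(a, n, m))
    t_block = timeit.timeit(lambda: block_delete(a, n, m), number=number) / number
    t_mod = timeit.timeit(lambda: mod_delete(a, n, m), number=number) / number
    return t_block, t_mod

for size in (19, 999, 999_999):
    t_block, t_mod = compare(size)
    print(f"size={size:>7}: block_delete {t_block * 1e6:10.1f} µs, "
          f"mod_delete {t_mod * 1e6:10.1f} µs")
```

The absolute timings will differ from those quoted here, and the crossover point between the two functions may shift as well.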

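Thanks, this works. Could you explain what the command does? I don't really understand what the "6" stands for.

Yes, I figured you would come up with a strides-based answer; I just can't wrap my head around them.
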
Timings for the runtime test (block_delete vs. select_in_groups_strided):

In [637]: a = np.arange(1,21)

In [638]: %timeit block_delete(a,3,3)
10000 loops, best of 3: 21 µs per loop

In [639]: %timeit select_in_groups_strided(a,3,3)
100000 loops, best of 3: 6.44 µs per loop

In [640]: a = np.arange(1,2100)

In [641]: %timeit block_delete(a,3,3)
10000 loops, best of 3: 27 µs per loop

In [642]: %timeit select_in_groups_strided(a,3,3)
100000 loops, best of 3: 9.1 µs per loop

In [643]: a = np.arange(999999) + 1

In [644]: %timeit block_delete(a,3,3)
100 loops, best of 3: 2.24 ms per loop

In [645]: %timeit select_in_groups_strided(a,3,3)
1000 loops, best of 3: 1.12 ms per loop

Medium-sized array, block_delete vs. mod_delete:

a = np.arange(999) + 1

%timeit block_delete(a, 3, 4)
The slowest run took 4.61 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 54.8 µs per loop

%timeit mod_delete(a, 3, 4)
The slowest run took 5.13 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 14.5 µs per loop