Python: change gaps in a numpy array based on gap size


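I need to filter out short nonzero sequences that lie between zeros. For example, this array:

t = np.array([1, 3, 1, 0, 0, 1, 8, 3, 0, 8, 2, 4, 7, 0,0,4,1])
should become:

array([1, 3, 1, 0, 0, 0, 0, 0, 0, 8, 2, 4, 7, 0, 0, 4, 1])
I find the first index of each nonzero sequence and count the nonzero values between consecutive start indices. I wrote the code below; it works, but it looks terrible. I tried some other stuff, but got an error. How can I rewrite it in a more Pythonic way?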

minseq = 4  # length of minimal non zero seq
p = np.where(fhr>0, 1, 0).astype(int)
s = np.array([1]+ list(np.diff(p)))
sind = np.where(s==1)[0][1:]
print(sind)
    
for i in range(len(sind) - 1):
    s1 = sind[i]
    e1 = sind[i+1]
    
    subfhr = np.where(fhr[s1:e1] > 0, 1, 0).sum()
    
    if (subfhr < minseq):
        
        print(s1, e1, subfhr)
        fhr[s1:e1] = 0

You can use an image-processing-based approach -

Sample run -

In [97]: a
Out[97]: array([1, 3, 1, 0, 0, 1, 8, 3, 0, 8, 2, 4, 7, 0, 0, 4, 1])

In [98]: remove_small_nnz(a, W=3)
Out[98]: array([1, 3, 1, 0, 0, 1, 8, 3, 0, 8, 2, 4, 7, 0, 0, 4, 1])

In [99]: remove_small_nnz(a, W=4)
Out[99]: array([1, 3, 1, 0, 0, 0, 0, 0, 0, 8, 2, 4, 7, 0, 0, 4, 1])

In [100]: remove_small_nnz(a, W=5)
Out[100]: array([1, 3, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 1])
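The definition of remove_small_nnz is not included in this copy of the answer. As a rough sketch only, one image-processing operation that would reproduce the sample runs is a 1-D binary opening (erosion followed by dilation with a length-W window). That this is what the answer used is an assumption, and the body below is a plain-NumPy reconstruction rather than the answerer's code. It also assumes W >= 1 and that runs touching either end of the array are always kept, as the sample output suggests.

```python
import numpy as np

def remove_small_nnz(a, W):
    """Zero out interior nonzero runs shorter than W (hypothetical reconstruction)."""
    a = np.asarray(a)
    kernel = np.ones(W, dtype=int)
    # Pad with W True values on each side so runs touching an end always survive.
    padded = np.r_[np.ones(W, dtype=bool), a != 0, np.ones(W, dtype=bool)]
    # Erosion: eroded[j] is True when padded[j:j+W] is entirely True.
    eroded = np.convolve(padded.astype(int), kernel, mode="valid") == W
    # Dilation: a position is kept if at least one all-True window covers it.
    opened = np.convolve(eroded.astype(int), kernel, mode="full") > 0
    keep = opened[W:-W]  # drop the padding again
    # Returns a new array; the input is left unmodified.
    return np.where(keep, a, 0)

a = np.array([1, 3, 1, 0, 0, 1, 8, 3, 0, 8, 2, 4, 7, 0, 0, 4, 1])
print(remove_small_nnz(a, W=4))
```

Padding with True values is a design choice to protect the edge runs; if edge runs should be filtered like any other run, the padding could be switched to False instead.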

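Since you are only looking for nonzeros, you can cast the array to boolean and look for the spots where there are as many True values in a row as you require.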

import numpy as np

def orig(fhr, minseq):
    p = np.where(fhr>0, 1, 0).astype(int)
    s = np.array([1]+ list(np.diff(p)))
    sind = np.where(s==1)[0][1:]
    for i in range(len(sind) - 1):
        s1 = sind[i]
        e1 = sind[i+1]
        subfhr = np.where(fhr[s1:e1] > 0, 1, 0).sum()

        if (subfhr < minseq):

            fhr[s1:e1] = 0
    return fhr

def update(fhr, minseq):
    # convert the sequence to boolean
    nonzero = fhr.astype(bool)
    # stack the boolean array with lagged copies of itself
    seqs = np.stack([nonzero[i:-minseq+i] for i in range(minseq)],
                    axis=1)
    # find the spots where the sequence is long enough
    # (np.bool was removed in NumPy 1.24, so the builtin bool is used here)
    inseq = np.r_[np.zeros(minseq, bool), seqs.sum(axis=1) == minseq]
    # the start and end of the series are assumed to be included in the result
    inseq[minseq] = True
    inseq[-1] = True
    
    # make sure that the full sequence is included. 
    # There may be a way to vectorize this further
    for ind in np.where(inseq)[0]:
        inseq[ind-minseq:ind] = True
    # Apply the inseq array as a mask
    return inseq * fhr


fhr = np.array([1, 3, 1, 0, 0, 1, 8, 3, 0, 8, 2, 4, 7, 0,0,4,1])
minseq = 4 

# orig() modifies its argument in place, so give it a copy to keep the comparison fair
print(np.all(orig(fhr.copy(), minseq) == update(fhr, minseq)))
# True
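The comment in update notes that the remaining loop might be vectorized further. One possible way, shown here only as a sketch and not as part of the answer above, is to work with run lengths: locate the start and end of every nonzero run, decide per run whether to drop it, and broadcast that decision back to the elements. The name zero_short_runs is hypothetical, and the rule that runs touching either end of the array are kept is an assumption taken from the expected output in the question.

```python
import numpy as np

def zero_short_runs(a, minseq):
    """Zero out nonzero runs shorter than minseq that have zeros on both sides."""
    a = np.asarray(a)
    mask = a != 0
    # +1 marks the first element of a run, -1 marks one past its last element.
    edges = np.diff(np.r_[False, mask, False].astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)  # exclusive end indices
    if starts.size == 0:
        return a.copy()  # no nonzero runs at all
    # A run is dropped when it is short and touches neither end of the array.
    drop = ((stops - starts) < minseq) & (starts > 0) & (stops < a.size)
    # Index of the run each element belongs to (only meaningful where mask is True).
    run_id = np.cumsum(edges[:-1] == 1) - 1
    kill = mask & drop[run_id]
    return np.where(kill, 0, a)

fhr = np.array([1, 3, 1, 0, 0, 1, 8, 3, 0, 8, 2, 4, 7, 0, 0, 4, 1])
print(zero_short_runs(fhr, 4))
```

This version has no Python-level loop and returns a new array instead of modifying its input.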
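Just curious, why doesn't the 8 2 4 7 series become zeros?
@QuangHoang There must be a window parameter, since it removes 1,8,3 but not 8 2 4 7.
@QuangHoang 8,2,4,7 has length 4; 1,8,3 has length 3. I'll correct the question.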