Python running median excluding zeros


I borrowed some code to compute the running median of an array, but for each running window I want to exclude zero values. The code is below:

def RunningMedian(seq, M):
    seq = iter(seq)
    s = []
    m = M // 2

    # Set up list s (to be sorted) and load deque with first window of seq
    s = [item for item in islice(seq, M)]
    d = deque(s)
    # Simple lambda function to handle even/odd window sizes    
    median = lambda : s[m] if bool(M&1) else (s[m-1]+s[m]) * 0.5
    # Sort it in increasing order and extract the median ("center" of the sorted window)
    s.sort()
    # remove zeros from the array
    s = np.trim_zeros(s)
    print s
    medians = [median()]
    for item in seq:
        old = d.popleft()          # pop oldest from left
        d.append(item)             # push newest in from right
        del s[bisect_left(s, old)] # locate insertion point and then remove old 
        insort(s, item)            # insert newest such that new sort is not required        
        s = np.trim_zeros(s)
        print s
        medians.append(median())
    return medians

I tested the code, but it fails. My example is a = np.array([5, 2, 0, 9, 4, 2, 6, 8]), and I call the function as RunningMedian(a, 3). What I want for each running window is:

[2,5]
[2,9]
[4,9]
[2,4,9]
[2,4,6]
[2,6,8]

However, after calling the function above, it prints:

[2, 5]
[2, 9]
[4, 9]
[2, 9]
[2, 6]
[2, 8]

It also returns the wrong medians. The call returns:
[5,9,9,9,6,8]

Can anyone help me fix this? Thanks.

Try:

[s[s!=0] for s in np.dstack((a[:-2], a[1:-1], a[2:]))[0]]

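The main problem with your code is that throwing away the zeros in the list s changes its length, so it no longer matches the deque d that tracks the window. That explains why you don't get the length-3 windows at the end.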
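
To make the mismatch concrete, here is a small sketch that replays the fourth step of the question's loop by hand. The starting values of d and s are the state the question's code is in after its third window; the variable names idx and removed are only for illustration. The deque still contains the 0, but the sorted list no longer does, so the removal hits a valid value instead:

```python
from bisect import bisect_left, insort
from collections import deque

# State after the third window in the question's code: the deque still holds
# the raw window, but the zero has already been trimmed from the sorted list.
d = deque([0, 9, 4])
s = [4, 9]

old = d.popleft()            # old == 0, which is no longer present in s
d.append(2)
idx = bisect_left(s, old)    # 0: the position where 0 *would* be inserted
removed = s[idx]             # so a valid value (4) gets deleted instead
del s[idx]
insort(s, 2)

print(removed)  # 4
print(s)        # [2, 9], although the window [9, 4, 2] should give [2, 4, 9]
```

This reproduces the fourth line of the question's actual output, [2, 9], where [2, 4, 9] was expected.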

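I'd suggest a different approach: use a proper function for the median, and ignore the zeros locally inside it. That is cleaner, and you no longer need trim_zeros (importing numpy just for that is very bad practice). Based on your function, I came up with the following: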

from itertools import islice
from collections import deque
from bisect import bisect_left,insort

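def median(s):
    # Filter zeros into a local copy; s itself keeps its full window length
    sp = [nz for nz in s if nz != 0]
    print(sp)
    if not sp:
        # Assumption: a window holding only zeros has no median, return None
        return None
    Mnow = len(sp)
    mnow = Mnow // 2
    # Parity comes from the filtered length, not from the window width M
    return sp[mnow] if Mnow & 1 else (sp[mnow-1] + sp[mnow]) * 0.5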

def RunningMedian(seq, M):
    seq = iter(seq)
    s = []
    m = M // 2

    # Set up list s (to be sorted) and load deque with first window of seq
    s = [item for item in islice(seq, M)]
    d = deque(s)
    ## Simple lambda function to handle even/odd window sizes    
    #median = lambda: s[m] if bool(M&1) else (s[m-1]+s[m])*0.5

    # Sort it in increasing order and extract the median ("center" of the sorted window)
    s.sort()
    medians = [median(s)]
    for item in seq:
        old = d.popleft()          # pop oldest from left
        d.append(item)             # push newest in from right
        del s[bisect_left(s, old)] # locate insertion point and then remove old 
        insort(s, item)            # insert newest such that new sort is not required        
        medians.append(median(s))
    return medians
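
Most of the changes are in the new median function, and I moved the print there. I also added your imports. Note that I would approach this problem very differently, and the current "fixed" version quite likely smells of duct tape.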

Anyway, it seems to work as you requested:

>>> a = [5, 2, 0, 9, 4, 2, 6, 8]

>>> RunningMedian(a,3)
[2, 5]
[2, 9]
[4, 9]
[2, 4, 9]
[2, 4, 6]
[2, 6, 8]
[3.5, 5.5, 6.5, 4, 4, 6]

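The medians were off in your version because the parity of the window was decided by the input window width M. Once you discard zeros, you can end up with smaller, even-length windows. In that case you don't want the middle (= second) element; you need to average the two middle elements. That is why your output was wrong.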
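
For comparison, here is a minimal sketch of a simpler variant that is not the code above: it keeps the raw window in a deque and lets statistics.median pick the odd or even rule from the filtered length. The function name running_median_nonzero and the choice to return None for an all-zero window are assumptions made for this illustration.

```python
from collections import deque
from statistics import median


def running_median_nonzero(seq, width):
    """Median of the non-zero values in each full window of `width` items.

    Assumption: a window containing only zeros yields None.
    """
    window = deque(maxlen=width)   # the oldest item drops out automatically
    medians = []
    for item in seq:
        window.append(item)
        if len(window) < width:
            continue               # wait until the first window is full
        nonzero = [x for x in window if x != 0]
        medians.append(median(nonzero) if nonzero else None)
    return medians


print(running_median_nonzero([5, 2, 0, 9, 4, 2, 6, 8], 3))
# [3.5, 5.5, 6.5, 4, 4, 6]
```

This sorts every window from scratch, so it gives up the incremental bisect/insort update in exchange for less state to keep consistent. For small window widths that trade-off is usually acceptable.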
