Create an intervaled ramp array based on a threshold - Python/NumPy


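I would like to measure the length of each sub-array that fulfils some condition (like a stop clock), but as soon as the condition is no longer fulfilled, the value should reset to zero. The resulting array should therefore tell me how many consecutive values have fulfilled the condition (e.g. value > 1):

[0, 0, 2, 2, 2, 2, 0, 3, 3, 0]

should result in the following array:

[0, 0, 1, 2, 3, 4, 0, 1, 2, 0]

One can easily define a function in Python that returns the corresponding NumPy array: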

import numpy as np

def StopClock(signal, threshold=1):
    clock = []
    current_time = 0  # length of the current run of values above threshold
    for item in signal:
        if item > threshold:
            current_time += 1
        else:
            current_time = 0  # condition broken: reset the clock
        clock.append(current_time)
    return np.array(clock)

StopClock([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])

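However, I really don't like this for-loop, especially since this counter has to run over a much longer dataset. I thought of some solution combining np.diff and np.cumsum, but I could not get the reset part to work. Is anyone aware of a more elegant NumPy-style solution to the above problem?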

This solution uses pandas to perform a groupby:

>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])
>>> threshold = 1  # the question's condition is value > 1
>>> np.where(
...     s > threshold,
...     s
...     .to_frame()  # Convert series to dataframe.
...     .assign(_dummy_=1)  # Add column of ones.
...     .groupby((s.gt(threshold) != s.gt(threshold).shift()).cumsum())['_dummy_']  # shift-cumsum pattern: one group id per run
...     .transform(lambda x: x.cumsum()),  # Cumsum the ones per group.
...     0)  # Fill value with zero where threshold not exceeded.
array([0, 0, 1, 2, 3, 4, 0, 1, 2, 0])
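The shift-cumsum pattern used as the groupby key above can be easier to follow when it is pulled out into its own step. The following is only a sketch of that idea, not the answer's exact code: the names mask, run_id and run_lengths are made up for illustration, and the counting is done with a grouped cumsum on the integer mask instead of a helper column of ones.

```python
import pandas as pd

signal = pd.Series([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])
threshold = 1  # same condition as the question: value > 1

mask = signal.gt(threshold)

# A new run starts wherever the mask differs from its predecessor;
# the cumulative sum of those change flags gives every run its own id.
run_id = (mask != mask.shift()).cumsum()

# Count up inside each run; runs of False only ever add zeros.
run_lengths = mask.astype(int).groupby(run_id).cumsum()

print(run_id.tolist())       # [1, 1, 2, 2, 2, 2, 3, 4, 4, 5]
print(run_lengths.tolist())  # [0, 0, 1, 2, 3, 4, 0, 1, 2, 0]
```

Because the False runs sum to zero on their own, this variant does not need the final np.where step.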

Another NumPy solution:

import numpy as np
a = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])

def stop_clock(signal, threshold=1):
    mask = signal > threshold
    # Indices at which a new run (of True or of False) begins
    indices = np.flatnonzero(np.diff(mask)) + 1
    # Split into runs, cumsum each run, and stitch the pieces back together
    return np.concatenate(list(map(np.cumsum, np.array_split(mask, indices))))

stop_clock(a)
# array([0, 0, 1, 2, 3, 4, 0, 1, 2, 0])
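To see what stop_clock does internally, it may help to print the intermediate pieces for the question's sample. This is a small illustrative trace, with boundaries and segments as made-up variable names; it relies on np.diff comparing neighbours for inequality when given a boolean array, as the function above already does.

```python
import numpy as np

signal = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])
mask = signal > 1

# On a boolean array np.diff flags positions where neighbours differ,
# so adding 1 gives the index at which each new run begins.
boundaries = np.flatnonzero(np.diff(mask)) + 1
print(boundaries)  # [2 6 7 9]

# One boolean segment per run; cumsum counts up inside True runs
# and stays at zero inside False runs.
segments = np.array_split(mask, boundaries)
for seg in segments:
    print(seg.astype(int), "->", np.cumsum(seg))

result = np.concatenate([np.cumsum(seg) for seg in segments])
print(result)  # [0 0 1 2 3 4 0 1 2 0]
```

Each segment is processed by a separate Python-level call, so the amount of interpreter work grows with the number of runs in the input.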

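Yes, we can use diff-style differentiation together with cumsum to create such intervaled ramps in a vectorized manner, which should be quite efficient, especially for large input arrays. The reset is handled by writing a suitable value at the first position after each interval, namely the negative of that interval's length, so that the cumulative sum falls back to zero there.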

Here is an implementation that does all of that:

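import numpy as np

def intervaled_ramp(a, thresh=1):
    mask = a > thresh

    # Get start, stop indices of each run of True values
    # (padding with False makes runs at either edge detectable)
    mask_ext = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
    s0, s1 = idx[::2], idx[1::2]

    out = mask.astype(int)
    # A run that reaches the last element has its stop at len(a); skip it
    valid_stop = s1[s1 < len(a)]
    # Right after each run, subtract the run length so the cumsum resets to 0
    out[valid_stop] = s0[:len(valid_stop)] - valid_stop
    return out.cumsum()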
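The reset trick inside intervaled_ramp can be traced by hand on the question's sample. The snippet below is just such a trace under the same logic; the names starts, stops, steps and ramp are chosen here for readability and do not appear in the answer.

```python
import numpy as np

a = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])
mask = a > 1

# Pad with False so runs touching either edge still get a start and a stop.
mask_ext = np.concatenate(([False], mask, [False]))
idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
starts, stops = idx[::2], idx[1::2]
print(starts, stops)  # [2 7] [6 9]

# +1 for every element above the threshold ...
steps = mask.astype(int)
# ... and minus the run length at the first position after each run.
valid = stops < len(a)  # a run ending at the last element needs no reset
steps[stops[valid]] = starts[valid] - stops[valid]
print(steps)  # [ 0  0  1  1  1  1 -4  1  1 -2]

ramp = steps.cumsum()
print(ramp)   # [0 0 1 2 3 4 0 1 2 0]
```

The -4 and -2 cancel exactly what the preceding run accumulated, which is why a single cumsum over the whole array is enough.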
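Runtime test

One way to benchmark fairly is to take the sample posted in the question, tile it many times, and use the result as the input array. With that setup, the timings are as follows: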

In [841]: a = np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0])

In [842]: a = np.tile(a,10000)

# @Alexander's soln
In [843]: %timeit pandas_app(a, threshold=1)
1 loop, best of 3: 3.93 s per loop

# @Psidom 's soln
In [844]: %timeit stop_clock(a, threshold=1)
10 loops, best of 3: 119 ms per loop

# Proposed in this post
In [845]: %timeit intervaled_ramp(a, thresh=1)
1000 loops, best of 3: 527 µs per loop
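The session above uses IPython's %timeit. Outside IPython, a rough equivalent can be put together with the standard-library timeit module. The sketch below makes a few assumptions: loop_clock is a hypothetical name for the question's loop version, only the loop and intervaled_ramp are timed, and the printed numbers will depend on the machine, so no timings are quoted here.

```python
import timeit
import numpy as np

def loop_clock(signal, threshold=1):
    # Loop-based reference, same logic as the question's StopClock
    clock, current = [], 0
    for item in signal:
        current = current + 1 if item > threshold else 0
        clock.append(current)
    return np.array(clock)

def intervaled_ramp(a, thresh=1):
    # Copied from the answer: cumsum with a negative reset after each run
    mask = a > thresh
    mask_ext = np.concatenate(([False], mask, [False]))
    idx = np.flatnonzero(mask_ext[1:] != mask_ext[:-1])
    s0, s1 = idx[::2], idx[1::2]
    out = mask.astype(int)
    valid_stop = s1[s1 < len(a)]
    out[valid_stop] = s0[:len(valid_stop)] - valid_stop
    return out.cumsum()

# Same setup as the benchmark: the question's sample tiled 10000 times
a = np.tile(np.array([0, 0, 2, 2, 2, 2, 0, 3, 3, 0]), 10000)

# Check that both approaches agree before comparing their speed
assert np.array_equal(loop_clock(a), intervaled_ramp(a))

for name, fn in [("loop_clock", loop_clock), ("intervaled_ramp", intervaled_ramp)]:
    # Best of 3 repeats, 3 calls per repeat, reported per call
    best = min(timeit.repeat(lambda: fn(a), number=3, repeat=3)) / 3
    print(f"{name}: {best * 1e3:.3f} ms per call")
```

Checking that the outputs agree first keeps the comparison honest: a fast function that returns the wrong ramp is of no use.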

While Alexander's solution is very elegant and Psidom's is the most readable, this solution is perfect because of its speed. Thank you all!

Sample runs

Input (a) : 
[5 3 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 1]
Output (intervaled_ramp(a, thresh=1)) : 
[1 2 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 0]

Input (a) : 
[1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 1]
Output (intervaled_ramp(a, thresh=1)) : 
[0 0 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 0]

Input (a) : 
[1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 5]
Output (intervaled_ramp(a, thresh=1)) : 
[0 0 0 1 2 0 0 1 2 3 4 0 1 2 0 0 0 1 0 1 2 3 4 0 1]

Input (a) : 
[1 1 1 4 5 0 0 2 2 2 2 0 3 3 0 1 1 2 0 3 5 4 3 0 5]
Output (intervaled_ramp(a, thresh=0)) : 
[1 2 3 4 5 0 0 1 2 3 4 0 1 2 0 1 2 3 0 1 2 3 4 0 1]