Optimizing a variable-step decimation function in Python


python optimization


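I'm working with simulation waveforms and need to greatly reduce their size before I can save them to disk.

Simulation data differs from conventional sampled data in that it has a dynamic time step. In principle this gives high time resolution only where something happens and less data where nothing happens, but it is far from perfect. Because the simulator writes a time snapshot whenever an event occurs anywhere, many curves end up with regular, small time steps they don't need. That represents a large amount of extra data (up to 1000% overhead, which translates to 200 GB of junk data per day).

I already have an implementation that removes these points, but the function is really slow. So slow that it takes about 20% of the processor time (which should be reserved for the simulation).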

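My function does the following:

1. Take 3 consecutive points.
2. Linearly interpolate the y value of the center point.
3. Compare the interpolated value with the actual value (absolute and relative error).
4. If the error is small, mark that point for removal.
5. Loop back to 1.

Once all points have been tested, remove the marked points, except those that sit right after a point already removed in this pass (simply deleting every marked point could delete all the points, because the linear interpolation would lose its reference data). Start over until no (or few) points are removed.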

Here is the code:

import numpy as np

# Return True if either
#    the difference between the two points a and p is smaller than the absolute error ea, or
#    the relative error between a and p is smaller than the proportional error ep
def markForRemoval(a,p,ea,ep):
    ret = (abs(a-p)<ea)
    if a != 0:
        e = abs(abs(p/a)-1)
        ret = ret or e<ep
    return ret

def inter(x0,x1,x2,y0,y2):
    d = (y0-y2)/(x0-x2)
    return d*(x1-x0)+y0

def checkBeforeDel(showMe):
    last = True  # the first element is always kept
    for idx in range(len(showMe)):
        if not showMe[idx] and not last:  # previous element was removed: do not allow a second consecutive removal
            showMe[idx] = True  # unmark this element; it will not be deleted in this pass
        last = showMe[idx]  # remember whether this element is being removed
    return showMe

def decimate(t,d,ea,ep):
    #get time shifted vector so that a list comprehension can be used
    zipped = zip(t[2:],t[1:-1],t[:-2],d[2:],d[1:-1],d[:-2])
    #create a mask for np. Needs to invert mark for removal to get a show/hide mask instead
    show = np.concatenate(
        ([True],#always keep first and last element
          ~np.array([markForRemoval(inter(t0,t1,t2,d0,d2), d1, ea, ep) for [t0,t1,t2,d0,d1,d2] in zipped]),
         [True]))

    show = checkBeforeDel(show)
    tret = t[show]#decimate the time vector using constructed mask
    dret = d[show]#decimate the data vector using constructed mask
    dec = len(t)-np.sum(show)#tell how many points have been decimated
    return (tret, dret, dec)#return the whole stuff

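For 10 million points, this takes 10 to 50 seconds on my PC, depending on the number of points removed. At the beginning it removes almost half of the data, but the yield drops with every pass (which is why the "last" 1000 points are kept: the loop stops once a pass removes fewer than 1000 points).

There are a few things to consider in this case:

- This is not a true decimation: I am not keeping one sample out of every X.
- Two consecutive samples cannot be removed in a single pass, even though a later pass may remove samples that were spared earlier.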

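I see three candidates for optimization, but I don't know how to check which one I should focus on (I know how to time the whole process, not a subset of it):

- The markForRemoval function.
- The list comprehension. I'm a C/ASM guy and Python is torturing my brain: there are too many ways to do the same thing. In the end, I'm not even sure it helps speed.
- The check for consecutive samples before deletion. I would love to fold it into the list comprehension above, if only to satisfy my craving for code elegance. What is certain is that with the current construct I loop over the same vector twice, and one of those loops is wasted.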
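One way to address the first two points together is to let NumPy do the interpolation and the error test on whole arrays instead of calling two Python functions per point. The following is only a sketch under assumptions: the name mark_for_removal_vec is hypothetical, t is assumed to be strictly increasing, and the arithmetic is written in the same operand order that decimate passes to inter, so it should produce the same marks as the list comprehension.

```python
import numpy as np

def mark_for_removal_vec(t, d, ea, ep):
    """Return one bool per interior point: True if the point may be removed."""
    t = np.asarray(t, dtype=float)
    d = np.asarray(d, dtype=float)
    t0, t1, t2 = t[:-2], t[1:-1], t[2:]
    d0, d1, d2 = d[:-2], d[1:-1], d[2:]
    # Linear interpolation of the center value from its two neighbors,
    # anchored on the right-hand neighbor as in the original call to inter().
    a = (d2 - d0) / (t2 - t0) * (t1 - t2) + d2
    abs_ok = np.abs(a - d1) < ea
    # Relative test only where the interpolated value is non-zero,
    # mirroring the "if a != 0" guard in markForRemoval.
    rel = np.full(a.shape, np.inf)
    nz = a != 0
    rel[nz] = np.abs(np.abs(d1[nz] / a[nz]) - 1)
    return abs_ok | (rel < ep)
```

Inside decimate, the mask could then be built as np.concatenate(([True], ~mark_for_removal_vec(t, d, ea, ep), [True])), which keeps the first and last points as before.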

Edit: added the missing inner functions above, plus the profiler code and its output below.

import cProfile

def decimateAll(t,d,ea,ep,stopAt):
    dec=stopAt
    # repeat until a pass removes fewer than stopAt points
    while dec>=stopAt:
        t,d,dec = decimate(t,d,ea,ep)  # use the arguments rather than the globals
    return (t,d)

cProfile.run('decimateAll(time,Vout,ErrorAbsolute,ErrorProportionnal,1000)')

49992551 function calls in 41.260 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  9998483 5.609    0.000    5.609    0.000 <ipython-input-24-22d73b8bfa1a>:12(inter)
   12   15.999    1.333   15.999    1.333 <ipython-input-24-22d73b8bfa1a>:17(checkBeforeDel)
  9998483 10.750    0.000   12.524    0.000 <ipython-input-24-22d73b8bfa1a>:4(markForRemoval)
   12    0.140    0.012   41.252    3.438 <ipython-input-25-3960a7141d64>:1(decimate)
    1    0.008    0.008   41.260   41.260 <ipython-input-25-3960a7141d64>:25(decimateAll)
   12    6.515    0.543   24.647    2.054 <ipython-input-25-3960a7141d64>:7(<listcomp>)
    1    0.000    0.000   41.260   41.260 <string>:1(<module>)
   12    0.000    0.000    0.010    0.001 fromnumeric.py:1821(sum)
   12    0.000    0.000    0.010    0.001 fromnumeric.py:64(_wrapreduction)
 29995449 1.774    0.000    1.774    0.000 {built-in method builtins.abs}
    1    0.000    0.000   41.260   41.260 {built-in method builtins.exec}
   12    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
   12    0.000    0.000    0.000    0.000 {built-in method builtins.len}
   12    0.452    0.038    0.452    0.038 {built-in method numpy.core.multiarray.array}
   12    0.005    0.000    0.005    0.000 {built-in method numpy.core.multiarray.concatenate}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   12    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
   12    0.010    0.001    0.010    0.001 {method 'reduce' of 'numpy.ufunc' objects}


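If I read the profile output correctly, the function that makes sure two consecutive elements are not removed carries a huge overhead. I did not expect that at all.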
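A plausible explanation is that checkBeforeDel indexes a NumPy array one element at a time from a Python loop, which is slow. The rule it applies can be restated per run of consecutive removal candidates: the 1st, 3rd, 5th, ... candidate in a run is removed and the 2nd, 4th, ... is spared. Under that reading, the check can be sketched without a Python loop. The name check_before_del_vec is hypothetical, and unlike the original it returns a new array instead of modifying its argument.

```python
import numpy as np

def check_before_del_vec(show):
    """Vectorized equivalent of checkBeforeDel for a show/hide mask."""
    show = np.asarray(show, dtype=bool)
    idx = np.arange(show.size)
    # Index of the most recent kept point at or before each position (-1 if none).
    last_kept = np.maximum.accumulate(np.where(show, idx, -1))
    # Position inside a run of removal candidates: 1, 2, 3, ... (0 = already kept).
    run_pos = idx - last_kept
    # Odd positions are removed, even positions are spared.
    return show | (run_pos % 2 == 0)

mask = check_before_del_vec([True, False, False, False, True, False, True])
print(mask.tolist())  # [True, False, True, False, True, False, True]
```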

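It sounds like you may already be doing this, but if not, look into profiling your code. Beyond improving the current implementation, the only way to gain more is to find and implement a better algorithm.

Or at least get advice on how to improve it; I would not dare claim that I got the most optimized version on my first try. Profiling (thanks for the link, by the way) is the icing on the cake, but even so, without knowing how to speed up any part of the code, it does not really solve my problem.

Running the profiler is only the first step: it tells you where to spend your optimization time to get the most out of your effort. Short of implementing a different algorithm, you may be able to improve the implementation of the one you are using. There is also a third-party tool that collects finer-grained execution information, which again tells you where to focus your time.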
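To profile a single function rather than the whole run, and to see the most expensive entries first, the standard library's cProfile and pstats modules can be combined roughly as follows. This is a minimal sketch; profile_call is a hypothetical helper name, not something from the question.

```python
import cProfile
import io
import pstats

def profile_call(func, *args, top=10):
    """Run func(*args) under cProfile; return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by time spent inside each function itself and keep the top entries.
    stats.sort_stats("tottime").print_stats(top)
    return result, buf.getvalue()
```

For example, calling profile_call(decimate, t, d, ea, ep) and printing the returned report would list the functions of a single pass ordered by tottime, so the biggest contributor appears on the first line.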
