Optimizing a variable-step decimation function in Python

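I am working with simulation waveforms and need to shrink them considerably before saving them to disk. Simulation data differs a little from classic sampled data: it has a dynamic time step. This raises the time resolution only where something is happening and reduces the data where nothing is, but it is far from perfect. Because the simulator writes a time snapshot whenever an event occurs anywhere, many curves end up with regular, small time steps that they do not need. That represents a lot of extra data (up to 1000% overhead, which translates into 200Gb of junk data per day). I already have an implementation that does this, but the function is really slow. It is so slow that it takes up about 20% of the processing …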





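  • Take 3 consecutive points
  • Linearly interpolate the y value of the center point
  • Compare the interpolated value with the actual value (absolute and relative error)
  • If the error is small, mark that point for removal
  • Loop back to the first step
  • Once all points have been tested, delete the marked points, except those adjacent to a point already deleted in this pass (simply deleting every marked point could delete all the points, because the linear interpolation would lose its reference data). Restart until no (or very few) points are removed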


    import numpy as np

    # Return True if either
    #    the difference between the two points a and p is smaller than the absolute error ea, or
    #    the relative error between a and p is smaller than the proportional error ep
    def markForRemoval(a,p,ea,ep):
        ret = (abs(a-p)<ea)
        if a != 0:
            e = abs(abs(p/a)-1)
            ret = ret or e<ep
        return ret
    # Linearly interpolate the y value at x1 from the points (x0,y0) and (x2,y2)
    def inter(x0,x1,x2,y0,y2):
        d = (y0-y2)/(x0-x2)
        return d*(x1-x0)+y0
    def checkBeforeDel(showMe):
        last = True # first element is always kept
        for idx,items in enumerate(showMe):
            if showMe[idx] == False and last == False: # if the previous element has been removed, do not accept a second removal
                showMe[idx]=True # this element has been unmarked and won't be deleted in this step
            last=showMe[idx] # tell whether this iteration resulted in a removal or not
        return showMe
    def decimate(t,d,ea,ep):
        # get time-shifted vectors so that a list comprehension can be used
        zipped = zip(t[2:],t[1:-1],t[:-2],d[2:],d[1:-1],d[:-2])
        # create a mask for np. The removal marks are inverted to get a show/hide mask instead
        show = np.concatenate(
            ([True], # always keep the first and last elements
              ~np.array([markForRemoval(inter(t0,t1,t2,d0,d2), d1, ea, ep) for [t0,t1,t2,d0,d1,d2] in zipped]),
              [True]))
        show = checkBeforeDel(show)
        tret = t[show] # decimate the time vector using the constructed mask
        dret = d[show] # decimate the data vector using the constructed mask
        dec = len(t)-np.sum(show) # number of points that have been decimated
        return (tret, dret, dec) # return the decimated vectors and the removal count


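  • This is not a true decimation: I am not keeping one sample out of every X
  • Two consecutive samples cannot be removed in a single pass, even though a later pass may remove a sample that was spared earlier
  • I see three candidates for optimization, but I don't know how to check which one I should focus on (I know how to time the whole process, but not a subset of it)

  • markForRemoval
  • The list comprehension. I am a C/ASM guy and Python is torturing my brain: there are too many ways to do the same thing. In the end, I am not even sure it helps speed
  • The pre-deletion check for consecutive samples. I would love to fold it into the list comprehension above, if only to satisfy my craving for elegant code. What is certain is that, with the current construction, I loop over the same vector twice, and one of those loops is wasted
  • Edit: added the missing inner functions above, plus the profiler code and its output below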
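
For the first two candidates listed above (`markForRemoval` and the list comprehension), one option worth measuring is to drop the per-point Python calls and evaluate the interpolation and both error tests on whole NumPy arrays. The following is only a minimal sketch under stated assumptions, not a drop-in replacement: `mark_for_removal_vec` is a hypothetical name, and `t` and `d` are assumed to be 1-D float arrays of equal length with strictly increasing `t`.

```python
import numpy as np

def mark_for_removal_vec(t, d, ea, ep):
    """Return one boolean per interior point: True where the point is close
    enough to the straight line through its two neighbours."""
    t0, t1, t2 = t[:-2], t[1:-1], t[2:]
    d0, d1, d2 = d[:-2], d[1:-1], d[2:]
    # Linear interpolation of every centre point from its neighbours.
    a = d0 + (d2 - d0) / (t2 - t0) * (t1 - t0)
    # Absolute-error test, same condition as in markForRemoval.
    mark = np.abs(a - d1) < ea
    # Relative-error test, evaluated only where the interpolated value is
    # non-zero (mirrors the `if a != 0` guard of the scalar version).
    nz = a != 0
    rel_ok = np.zeros(a.shape, dtype=bool)
    rel_ok[nz] = np.abs(np.abs(d1[nz] / a[nz]) - 1) < ep
    return mark | rel_ok
```

The keep mask would then be built as `np.concatenate(([True], ~mark_for_removal_vec(t, d, ea, ep), [True]))`. Because the arithmetic is ordered slightly differently from `inter`, points sitting exactly on an error threshold may be classified differently in the last floating-point bit.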

    import cProfile
    def decimateAll(t,d,ea,ep,stopAt):
        dec = stopAt # force at least one pass
        while dec>=stopAt:
            [t,d, dec] = decimate(t,d,ea,ep)
        return (t,d)
    cProfile.run('decimateAll(time,Vout,ErrorAbsolute,ErrorProportionnal,1000)')

    Ordered by: standard name
       ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      9998483    5.609    0.000    5.609    0.000 <ipython-input-24-22d73b8bfa1a>:12(inter)
           12   15.999    1.333   15.999    1.333 <ipython-input-24-22d73b8bfa1a>:17(checkBeforeDel)
      9998483   10.750    0.000   12.524    0.000 <ipython-input-24-22d73b8bfa1a>:4(markForRemoval)
           12    0.140    0.012   41.252    3.438 <ipython-input-25-3960a7141d64>:1(decimate)
            1    0.008    0.008   41.260   41.260 <ipython-input-25-3960a7141d64>:25(decimateAll)
           12    6.515    0.543   24.647    2.054 <ipython-input-25-3960a7141d64>:7(<listcomp>)
            1    0.000    0.000   41.260   41.260 <string>:1(<module>)
           12    0.000    0.000    0.010    0.001
           12    0.000    0.000    0.010    0.001
     29995449    1.774    0.000    1.774    0.000 {built-in method builtins.abs}
            1    0.000    0.000   41.260   41.260 {built-in method builtins.exec}
           12    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
           12    0.000    0.000    0.000    0.000 {built-in method builtins.len}
           12    0.452    0.038    0.452    0.038 {built-in method numpy.core.multiarray.array}
           12    0.005    0.000    0.005    0.000 {built-in method numpy.core.multiarray.concatenate}
            1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
           12    0.000    0.000    0.000    0.000 {method 'items' of 'dict' objects}
           12    0.010    0.001    0.010    0.001 {method 'reduce' of 'numpy.ufunc' objects}
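
The profile above attributes about 16 s of `tottime` to only 12 calls of `checkBeforeDel`, which walks a NumPy boolean array element by element from Python. As a hedged sketch of one alternative: inside each run of consecutive points marked for removal, that loop ends up removing the 1st, 3rd, 5th, ... point and keeping the others, so the same mask can be computed with array operations. `check_before_del_vec` is a hypothetical name that is not part of the original code, and `show` is assumed to be a 1-D boolean keep mask.

```python
import numpy as np

def check_before_del_vec(show):
    """Return a keep mask in which no two adjacent points are removed.

    `show` is a boolean keep mask (False means "marked for removal").
    """
    show = np.asarray(show, dtype=bool)
    marked = ~show
    idx = np.arange(show.size)
    # A run of removal marks starts where a marked point follows a kept one.
    prev_marked = np.concatenate(([False], marked[:-1]))
    starts = marked & ~prev_marked
    # For every position, the index of the latest run start at or before it.
    run_start = np.maximum.accumulate(np.where(starts, idx, 0))
    # Remove only the marked points at even offsets inside their run.
    remove = marked & ((idx - run_start) % 2 == 0)
    return ~remove
```

Unlike `checkBeforeDel`, this returns a new array instead of modifying `show` in place. Whether it actually beats a plain loop over `show.tolist()` would need to be measured on real data.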
