Finding a pattern in a plot with Python


python python-3.x numpy


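This plot was generated by the gnuplot script below. The estimated.csv file can be found at the following link:

I want to detect the pattern in the estimated signal from the previous plot so that it comes close to the next plot. My ground truth (the actual signal) is shown in the plot below.

This was my initial approach: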

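#!/usr/bin/env python
import sys

import numpy as np
from shapely.geometry import LineString
#-------------------------------------------------------------------------------
def load_data(fname):
    # Each CSV row is an "x,y" pair; the rows become the vertices of a polyline.
    return LineString(np.genfromtxt(fname, delimiter = ','))
#-------------------------------------------------------------------------------
lines = list(map(load_data, sys.argv[1:]))

# The intersection can be a single geometry or a collection. Collections are
# not directly iterable in Shapely 2.x, so iterate over .geoms when it exists.
crossing = lines[0].intersection(lines[1])
for g in getattr(crossing, 'geoms', [crossing]):
    # Keep only point crossings; skip overlapping line segments.
    if g.geom_type != 'Point':
        continue
    print('%f,%f' % (g.x, g.y))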

I then call this Python script directly from my gnuplot script, as follows:

set terminal pngcairo
set output 'fig.png'

set datafile separator comma
set yr [0:700]
set xr [0:10]

set xtics 0,2,10
set ytics 0,100,700

set grid

set xlabel "Time [seconds]"
set ylabel "Segments"

plot \
    'estimated.csv' w l lc rgb 'dark-blue' t 'Estimated', \
    'actual.csv' w l lc rgb 'green' t 'Actual', \
    '<python filter.py estimated.csv actual.csv' w p lc rgb 'red' ps 0.5 pt 7 t ''

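I think pandas.rolling_max() is the right approach (in current pandas versions this is written as Series.rolling(...).max()). We load the data into a DataFrame and compute the rolling maximum over a window of 8500 values. After that, the curves look similar. You can experiment with the window size a little to optimize the result.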
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
plt.ion()
names = ['actual.csv','estimated.csv']
#-------------------------------------------------------------------------------
def load_data(fname):
    return np.genfromtxt(fname, delimiter = ',')
#-------------------------------------------------------------------------------

# actual.csv is loaded only to plot it for comparison; it is not used to
# compute the rolling maximum.
data = [load_data(name) for name in names]
actual_data = data[0]
estimated_data = data[1]
df = pd.read_csv('estimated.csv', names=('x','y'))
# pd.rolling_max() was removed from later pandas releases;
# Series.rolling(...).max() computes the same rolling maximum.
df['rolling_max'] = df['y'].rolling(8500).max()
plt.figure()
plt.plot(actual_data[:,0],actual_data[:,1], label='actual')
plt.plot(estimated_data[:,0],estimated_data[:,1], label='estimated')
plt.plot(df['x'], df['rolling_max'], label = 'rolling')

plt.legend()
plt.title('Actual vs. Interpolated')
plt.xlim(0,10)
plt.ylim(0,500)
plt.xlabel('Time [Seconds]')
plt.ylabel('Segments')
plt.grid()
plt.show(block=True)
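
To make the effect of the rolling maximum easier to see, here is a minimal sketch on a tiny made-up signal. The function name rolling_envelope and the sample values are assumptions for illustration only; they are not taken from estimated.csv, and the window of 3 merely stands in for the 8500 used above.

```python
import numpy as np
import pandas as pd

def rolling_envelope(y, window):
    """Return the trailing rolling maximum of y over `window` samples."""
    return pd.Series(y).rolling(window).max()

# Hypothetical stand-in for the 'y' column: rising peaks separated by zeros.
y = np.array([0, 1, 0, 2, 0, 3, 0, 4, 0, 5], dtype=float)

env = rolling_envelope(y, window=3)

# The first window - 1 entries are NaN because the window is not yet full.
# After that, each entry holds the largest of the last 3 samples, so the
# zeros between the peaks are bridged.
print(env.tolist())
```

A larger window bridges longer gaps between peaks but also reacts more slowly when the underlying level drops, which is why the window size is worth tuning.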


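Answering the question from the comments:
Because rolling() works on a window of a defined size, the first values of rolling().max() are NaN until the window is full. To replace these NaNs, I suggest reversing the whole series and computing the window backwards. We can then replace all the NaNs with the values computed backwards. I adjusted the window length for the backward computation; otherwise we would get wrong data.
This code works: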
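import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
plt.ion()

df = pd.read_csv('estimated.csv', names=('x','y'))
# Forward pass: the first 8499 values are NaN because the window is not full.
df['rolling_max'] = df['y'].rolling(8500).max()
# Backward pass over the reversed series with a shorter window; the result is
# aligned back to the original rows by index when it is assigned.
df['rolling_max_backwards'] = df['y'][::-1].rolling(850).max()
# Fill the leading NaNs with the backward values. Assigning the result avoids
# relying on inplace=True through attribute access.
df['rolling_max'] = df['rolling_max'].fillna(df['rolling_max_backwards'])
plt.figure()
plt.plot(df['x'], df['rolling_max'], label = 'rolling')

plt.legend()
plt.title('Actual vs. Interpolated')
plt.xlim(0,10)
plt.ylim(0,700)
plt.xlabel('Time [Seconds]')
plt.ylabel('Segments')
plt.grid()
plt.show(block=True)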

We get the following result:
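
As an aside to the reversal trick above, pandas also offers the min_periods parameter of rolling(), which lets a window produce a value before it is full. The sketch below uses made-up numbers and is not equivalent to the backward fill: the leading values come from shortened trailing windows rather than from looking ahead, so treat it as an alternative to try, not as a drop-in replacement.

```python
import pandas as pd

# Hypothetical sample values, for illustration only.
y = pd.Series([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
window = 4

# Default behaviour: min_periods equals the window size, so the first
# window - 1 results are NaN.
strict = y.rolling(window).max()

# min_periods=1: a result is produced as soon as one sample is available,
# so there are no leading NaNs to fill afterwards.
relaxed = y.rolling(window, min_periods=1).max()

print(strict.tolist())
print(relaxed.tolist())
```

Once the window is full, both variants give identical values; they differ only in the first window - 1 rows.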
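Have you tried a Kalman filter? It should track the curve the way you want. Basically, it tries to "follow" your curve at a certain rate, so it smooths your signal. But now I think it won't work in your case. :/ It is good for removing noise and finding the "true" signal, but it doesn't help much here. Still worth a look in case you need it in the future.

OK, thanks. I will read up on it.

With some thought, you could use a peak detection algorithm, then a clustering algorithm such as DBSCAN to remove the outliers, and finally a Kalman filter to smooth it all.

Hi Franz. Thank you very much. Please don't use the actual.csv file. It is my ground truth and should not be fed into the program. The pattern should be detected from estimated.csv only.

It is not used in the computation. I only use it to show the similarity. If you like, delete the lines that use actual.csv.

I added a solution to the code to handle the missing values. It is not as clean as I would like, but it works.

Hi Franz, have you used a Kalman filter before? I wonder whether we could try fitting the same data with a Kalman filter model. Thanks.

Hi Desta, unfortunately I'm not really familiar with Kalman filters. I have only read about them and haven't needed them so far. Have you tried something like pykalman: ?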
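
The peak-detection idea from the comments can be sketched as follows. This is only an illustration under stated assumptions: the signal is synthetic rather than estimated.csv, the name peak_envelope and the min_distance value are hypothetical, and the DBSCAN outlier removal and Kalman smoothing steps mentioned above are not included. It uses scipy.signal.find_peaks to locate local maxima and np.interp to connect them.

```python
import numpy as np
from scipy.signal import find_peaks

def peak_envelope(x, y, min_distance):
    """Connect local maxima of y by linear interpolation.

    Peaks closer than min_distance samples to a higher peak are dropped
    by find_peaks. Returns the peak indices and the interpolated curve.
    """
    peaks, _ = find_peaks(y, distance=min_distance)
    if len(peaks) == 0:
        # No interior local maximum found: nothing to interpolate.
        return peaks, np.full(len(y), np.nan)
    # Outside the first and last peak, np.interp holds the edge values.
    return peaks, np.interp(x, x[peaks], y[peaks])

# Synthetic stand-in signal: humps whose height grows with time.
x = np.linspace(0.0, 10.0, 2001)
y = 50.0 * x * np.abs(np.sin(2.0 * np.pi * x))

peaks, env = peak_envelope(x, y, min_distance=50)
print(len(peaks), env[:5])
```

Unlike the rolling maximum, this curve passes exactly through the detected peaks but can dip slightly below the signal between them, so which of the two suits the data better is something to check against the plot.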