How to use pykalman filter_update for online regression

I would like to apply Kalman regression recursively to an incoming stream of price data using kf.filter_update(), but I can't get it to work. Here is example code that frames the problem.

Dataset (i.e. the stream): the data are read into a DataFrame, and the following code simulates the stream by iterating over df:
import numpy as np
import pandas as pd
from pykalman import KalmanFilter

df = pd.read_csv('data.txt')
df.dropna(inplace=True)

history = {}
history["spread"] = []
history["state_means"] = []
history["state_covs"] = []

for idx, row in df.iterrows():
    if idx == 0:  # Initialize the Kalman filter
        delta = 1e-9
        trans_cov = delta / (1 - delta) * np.eye(2)
        obs_mat = np.vstack([df.iloc[0].CAT, np.ones(df.iloc[0].CAT.shape)]).T[:, np.newaxis]
        kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
                          initial_state_mean=np.zeros(2),
                          initial_state_covariance=np.ones((2, 2)),
                          transition_matrices=np.eye(2),
                          observation_matrices=obs_mat,
                          observation_covariance=1.0,
                          transition_covariance=trans_cov)
        state_means, state_covs = kf.filter(np.asarray(df.iloc[0].DOG))
        history["state_means"], history["state_covs"] = state_means, state_covs
        slope = state_means[:, 0]
        print("SLOPE", slope)
    else:
        # No observation matrix is passed here; this is the call that raises the ValueError
        state_means, state_covs = kf.filter_update(history["state_means"][-1], history["state_covs"][-1], observation = np.asarray(df.iloc[idx].DOG))
        history["state_means"].append(state_means)
        history["state_covs"].append(state_covs)
        slope = state_means[:, 0]
        print("SLOPE", slope)

The Kalman filter initializes correctly and I get the first regression coefficient, but the subsequent update raises an exception:
Traceback (most recent call last):
SLOPE [ 6.70319125]
File "C:/Users/.../KalmanUpdate_example.py", line 50, in <module>
KalmanOnline(df)
File "C:/Users/.../KalmanUpdate_example.py", line 43, in KalmanOnline
state_means, state_covs = kf.filter_update(history["state_means"][-1], history["state_covs"][-1], observation = np.asarray(df.iloc[idx].DOG))
File "C:\Python27\Lib\site-packages\pykalman\standard.py", line 1253, in filter_update
2, "observation_matrix"
File "C:\Python27\Lib\site-packages\pykalman\standard.py", line 38, in _arg_or_default
+ ' You must specify it manually.') % (name,)
ValueError: observation_matrix is not constant for all time. You must specify it manually.
Process finished with exit code 1
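
Intuitively, the observation matrix is required (it is supplied in the initial step but not in the update step), but I can't figure out how to set it correctly. Any feedback is much appreciated.

Pykalman allows you to declare the observation matrix in two ways:
- [n_timesteps, n_dim_obs, n_dim_state] - once, for the whole estimation
- [n_dim_obs, n_dim_state] - separately for each estimation step

For online updates, use the second form and pass it to filter_update() through the observation_matrix argument at every step, as the modified code does.
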
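As a rough illustration of the two layouts, the sketch below builds both from a few made-up CAT prices (the values and variable names are placeholders, not taken from the dataset). With one observed variable and a two-element state (slope, intercept), a single step's matrix has shape (1, 2) and the whole-series stack has shape (n_timesteps, 1, 2).

```python
import numpy as np

# Made-up regressor values standing in for the CAT column (assumption).
cat = np.array([10.0, 11.0, 12.5])

# Per-step form: one row [x_t, 1], shape (n_dim_obs, n_dim_state) = (1, 2).
obs_mat_step = np.asarray([[cat[1], 1.0]])

# Whole-series form: one (1, 2) matrix per time step,
# shape (n_timesteps, n_dim_obs, n_dim_state) = (3, 1, 2).
obs_mat_all = np.vstack([cat, np.ones(cat.shape)]).T[:, np.newaxis]

print(obs_mat_step.shape)  # (1, 2)
print(obs_mat_all.shape)   # (3, 1, 2)
```

The slice obs_mat_all[1] holds the same row as obs_mat_step, so the per-step matrix is just one entry of the whole-series stack.
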
from pykalman import KalmanFilter
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('data.txt')
df.dropna(inplace=True)
# Make the index positional again so that idx matches df.iloc[idx] after dropna()
df.reset_index(drop=True, inplace=True)

n = df.shape[0]
n_dim_state = 2

# Pre-allocate storage for the filtered state at every time step
history_state_means = np.zeros((n, n_dim_state))
history_state_covs = np.zeros((n, n_dim_state, n_dim_state))

for idx, row in df.iterrows():
    if idx == 0:  # Initialize the Kalman filter
        delta = 1e-9
        trans_cov = delta / (1 - delta) * np.eye(2)
        obs_mat = [df.iloc[0].CAT, 1]
        kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
                          initial_state_mean=np.zeros(2),
                          initial_state_covariance=np.ones((2, 2)),
                          transition_matrices=np.eye(2),
                          observation_matrices=obs_mat,
                          observation_covariance=1.0,
                          transition_covariance=trans_cov)
        history_state_means[0], history_state_covs[0] = kf.filter(np.asarray(df.iloc[0].DOG))
        slope = history_state_means[0, 0]
        print("SLOPE", slope)
    else:
        # Rebuild the observation matrix from the current CAT price and pass it explicitly
        obs_mat = np.asarray([[df.iloc[idx].CAT, 1]])
        history_state_means[idx], history_state_covs[idx] = kf.filter_update(history_state_means[idx-1],
                                                                             history_state_covs[idx-1],
                                                                             observation = df.iloc[idx].DOG,
                                                                             observation_matrix=obs_mat)
        slope = history_state_means[idx, 0]
        print("SLOPE", slope)

plt.figure(1)
plt.plot(history_state_means[:, 0], label="Slope")
plt.grid()
plt.show()

It produces the following output:
SLOPE 6.70322464199
SLOPE 6.70512037269
SLOPE 6.70337808649
SLOPE 6.69956406785
SLOPE 6.6961767953
SLOPE 6.69558438828
SLOPE 6.69581682668
SLOPE 6.69617670459
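
The reason filter_update() needs an observation matrix at every call can be seen from the update arithmetic itself. Below is a minimal sketch of one predict/update step for this two-parameter regression, written in dependency-free Python from the textbook Kalman equations. It is not pykalman's implementation, and the function name kalman_regression_step, its parameters, the wide uncorrelated prior, and the synthetic data are all assumptions made for illustration.

```python
def kalman_regression_step(mean, cov, x, y, obs_var=1.0, trans_var=0.0):
    """One predict/update step for y = slope * x + intercept.

    mean is [slope, intercept]; cov is a 2x2 nested list.
    The state transition is the identity, so the prediction keeps the mean.
    """
    # Predict: identity transition, only the covariance grows by trans_var.
    p00 = cov[0][0] + trans_var
    p01 = cov[0][1]
    p10 = cov[1][0]
    p11 = cov[1][1] + trans_var

    # The observation matrix for this step is H = [x, 1]; it changes with x.
    residual = y - (x * mean[0] + mean[1])
    hp0 = x * p00 + p10          # (H P)[0]
    hp1 = x * p01 + p11          # (H P)[1]
    ph0 = p00 * x + p01          # (P H^T)[0]
    ph1 = p10 * x + p11          # (P H^T)[1]
    s = x * ph0 + ph1 + obs_var  # innovation variance H P H^T + R

    # Kalman gain, corrected mean, and corrected covariance P - K H P.
    k0, k1 = ph0 / s, ph1 / s
    new_mean = [mean[0] + k0 * residual, mean[1] + k1 * residual]
    new_cov = [[p00 - k0 * hp0, p01 - k0 * hp1],
               [p10 - k1 * hp0, p11 - k1 * hp1]]
    return new_mean, new_cov


# Synthetic, noise-free stream generated from y = 2x + 1 (assumption).
mean = [0.0, 0.0]
cov = [[1e6, 0.0], [0.0, 1e6]]  # wide, uncorrelated prior (assumption)
for x in range(20):
    y = 2.0 * x + 1.0
    mean, cov = kalman_regression_step(mean, cov, float(x), y)

slope, intercept = mean
print(slope, intercept)
```

On this synthetic stream the estimates should end up close to 2 and 1. The row [x, 1] plays the role of obs_mat in the code above, which is why it has to be rebuilt from each new CAT value before the update.
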
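
Pykalman is not well documented, and there are errors on the official page. That is why I recommend testing the result against an offline estimation done in one step. In that case the observation matrix must be declared the way you did in your code:
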
from pykalman import KalmanFilter
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('data.txt')
df.dropna(inplace=True)

delta = 1e-9
trans_cov = delta / (1 - delta) * np.eye(2)
# One [CAT, 1] row per time step: shape (n_timesteps, 1, 2)
obs_mat = np.vstack([df.iloc[:].CAT, np.ones(df.iloc[:].CAT.shape)]).T[:, np.newaxis]

kf = KalmanFilter(n_dim_obs=1, n_dim_state=2,
                  initial_state_mean=np.zeros(2),
                  initial_state_covariance=np.ones((2, 2)),
                  transition_matrices=np.eye(2),
                  observation_matrices=obs_mat,
                  observation_covariance=1.0,
                  transition_covariance=trans_cov)

# Filter the whole series in a single call
state_means, state_covs = kf.filter(df.iloc[:].DOG)
print("SLOPE", state_means[:, 0])

plt.figure(1)
plt.plot(state_means[:, 0], label="Slope")
plt.grid()
plt.show()

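The result is the same.

Anton, thank you very much for your detailed answer and for taking the time to actually modify the code. It now works as expected.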