SARIMAX Python np.linalg.linalg.LinAlgError: LU decomposition error
I have a problem with time series analysis. I have a dataset with 5 features. Here is a subset of my input dataset:
date,price,year,day,totaltx
1/1/2016 0:00,434.46,2016,1,126762
1/2/2016 0:00,433.59,2016,2,147449
1/3/2016 0:00,430.36,2016,3,148661
1/4/2016 0:00,433.49,2016,4,185279
1/5/2016 0:00,432.25,2016,5,178723
1/6/2016 0:00,429.46,2016,6,184207
My endogenous data is the price column, and my exogenous data is the totaltx column.
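For illustration, the sample rows above can be loaded and split into endogenous and exogenous series roughly as follows. This is only a sketch: the names SAMPLE_CSV, endog and exog are not from the original post, and the date format is assumed to be month/day/year because the day column matches the second field.

```python
import io

import pandas as pd

# The six sample rows shown above, inlined so the sketch runs without prices.csv.
SAMPLE_CSV = """date,price,year,day,totaltx
1/1/2016 0:00,434.46,2016,1,126762
1/2/2016 0:00,433.59,2016,2,147449
1/3/2016 0:00,430.36,2016,3,148661
1/4/2016 0:00,433.49,2016,4,185279
1/5/2016 0:00,432.25,2016,5,178723
1/6/2016 0:00,429.46,2016,6,184207
"""

df = pd.read_csv(io.StringIO(SAMPLE_CSV))
# Assumption: dates are month/day/year with an hour:minute suffix.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y %H:%M")
df = df.set_index("date")

endog = df["price"]      # the series to be modelled
exog = df[["totaltx"]]   # regressors, kept 2-D (n_obs x n_regressors)

print(endog.shape, exog.shape)  # (6,) (6, 1)
```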
This is the code I am running, which raises the error:
import statsmodels.api as sm
import pandas as pd
import numpy as np
from numpy.linalg import LinAlgError


def arima(filteredData, coinOutput, window, horizon, trainLength):
    start_index = 0
    end_index = 0
    inputNumber = filteredData.shape[0]
    predictions = np.array([], dtype=np.float32)
    prices = np.array([], dtype=np.float32)
    # sliding on time series data with 1 day step
    while end_index < inputNumber - 1:
        end_index = start_index + trainLength
        trainFeatures = filteredData[start_index:end_index]["totaltx"]
        trainOutput = coinOutput[start_index:end_index]["price"]
        arima = sm.tsa.statespace.SARIMAX(endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0))
        arima_fit = arima.fit(disp=0)
        testdata = filteredData[end_index:end_index + 1]["totaltx"]
        total_sample = end_index - start_index
        predicted = arima_fit.predict(start=total_sample, end=total_sample, exog=np.array(testdata.values).reshape(-1, 1))
        price = coinOutput[end_index:end_index + 1]["price"].values
        predictions = np.append(predictions, predicted)
        prices = np.append(prices, price)
        start_index = start_index + 1
    return predictions, prices


def processCoins(bitcoinPrice, window, horizon):
    output = bitcoinPrice[horizon:][["date", "day", "year", "price"]]
    return output


trainLength = 100
for window in [3, 5]:
    for horizon in [1, 2, 5, 7, 10]:
        bitcoinPrice = pd.read_csv("..\\prices.csv", sep=",")
        coinOutput = processCoins(bitcoinPrice, window, horizon)
        predictions, prices = arima(bitcoinPrice, coinOutput, window, horizon, trainLength)
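The index arithmetic of the sliding window in the question can be isolated in a small generator, which makes it easier to check which rows are used for training and which single row is predicted. This is a sketch only; rolling_windows and its parameters are hypothetical names, not part of the original code.

```python
def rolling_windows(n_rows, train_length):
    """Yield (train_start, train_end, test_index) for a one-step sliding window.

    Training rows are [train_start, train_end) and the single test row is
    train_end, so train_end must remain a valid row index.
    """
    start = 0
    while start + train_length < n_rows:
        end = start + train_length
        yield start, end, end
        start += 1


windows = list(rolling_windows(n_rows=6, train_length=4))
print(windows)  # [(0, 4, 4), (1, 5, 5)]
```

Because coinOutput drops the first horizon rows, it is shorter than bitcoinPrice. Passing the smaller of the two lengths as n_rows would keep every window paired with a target price.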
This looks like it could be a bug. In the meantime, you can work around the problem by using a different initialization, as follows:
arima = sm.tsa.statespace.SARIMAX(
    endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0),
    initialization='approximate_diffuse')
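If only some windows fail, one option is to keep the default initialization where it works and switch to approximate_diffuse only after a LinAlgError. The sketch below assumes the error is raised while constructing or fitting the model. fit_with_fallback and make_sarimax_factory are hypothetical helper names, not statsmodels APIs; the only statsmodels calls used are SARIMAX(...) and .fit(...), as in the code above.

```python
from numpy.linalg import LinAlgError


def fit_with_fallback(build_model, **fit_kwargs):
    """Fit with the default initialization; retry with approximate_diffuse on LinAlgError.

    build_model is a hypothetical factory: build_model(**extra) must return an
    unfitted model and forward extra keyword arguments to its constructor.
    Returns (fitted_result, name_of_initialization_used).
    """
    try:
        return build_model().fit(**fit_kwargs), "default"
    except LinAlgError:
        model = build_model(initialization="approximate_diffuse")
        return model.fit(**fit_kwargs), "approximate_diffuse"


def make_sarimax_factory(endog, exog, order):
    """Return a factory that builds SARIMAX models on the given data."""
    # Imported lazily so fit_with_fallback itself depends only on NumPy.
    import statsmodels.api as sm

    def build(**extra):
        return sm.tsa.statespace.SARIMAX(endog=endog, exog=exog, order=order, **extra)

    return build
```

Inside the loop from the question, the two lines that create and fit the model could then become a single call such as arima_fit, used = fit_with_fallback(make_sarimax_factory(trainOutput.values, trainFeatures.values, (window, 0, 0)), disp=0). The second return value records which initialization each window ended up with, which may help when reporting the issue.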
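If you get a chance, please file a bug report.
I don't think this is a bug. I have tested my code with pmdarima versions 1.8 and 1.7.1, and I keep getting the same error on the same series.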