SARIMAX python np.linalg.linalg.LinAlgError: LU decomposition error


I have a problem with time series analysis. I have a dataset with 5 features. Here is a subset of my input dataset:

date,price,year,day,totaltx
1/1/2016 0:00,434.46,2016,1,126762
1/2/2016 0:00,433.59,2016,2,147449
1/3/2016 0:00,430.36,2016,3,148661
1/4/2016 0:00,433.49,2016,4,185279
1/5/2016 0:00,432.25,2016,5,178723
1/6/2016 0:00,429.46,2016,6,184207

My endogenous data is the price column, and my exogenous data is the totaltx column.
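For reference, the split into endogenous and exogenous series can be sketched as follows. This is only an illustration using the six sample rows above, read from an in-memory string instead of the real file; the variable names `sample`, `df`, `endog` and `exog` are chosen here for the example and do not appear in the code below.

```python
import io

import pandas as pd

# The six sample rows from the question, inlined so the snippet is self-contained.
sample = """date,price,year,day,totaltx
1/1/2016 0:00,434.46,2016,1,126762
1/2/2016 0:00,433.59,2016,2,147449
1/3/2016 0:00,430.36,2016,3,148661
1/4/2016 0:00,433.49,2016,4,185279
1/5/2016 0:00,432.25,2016,5,178723
1/6/2016 0:00,429.46,2016,6,184207
"""

df = pd.read_csv(io.StringIO(sample), parse_dates=["date"])

endog = df["price"]      # endogenous series: the value being modelled
exog = df[["totaltx"]]   # exogenous regressor, kept 2-D (n_obs x 1)
```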

This is the code I am running, which raises the error:

import statsmodels.api as sm
import pandas as pd
import numpy as np
from numpy.linalg import LinAlgError

def arima(filteredData, coinOutput, window, horizon, trainLength):
    start_index = 0
    end_index = 0
    inputNumber = filteredData.shape[0]
    predictions = np.array([], dtype=np.float32)
    prices = np.array([], dtype=np.float32)
    # slide the training window over the time series, one day per step
    while end_index < inputNumber - 1:
        end_index = start_index + trainLength
        trainFeatures = filteredData[start_index:end_index]["totaltx"]
        trainOutput = coinOutput[start_index:end_index]["price"]

        arima = sm.tsa.statespace.SARIMAX(endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0))
        arima_fit = arima.fit(disp=0)
        testdata=filteredData[end_index:end_index+1]["totaltx"]
        total_sample = end_index-start_index
        predicted = arima_fit.predict(start=total_sample, end=total_sample, exog=np.array(testdata.values).reshape(-1,1))
        price = coinOutput[end_index:end_index + 1]["price"].values

        predictions = np.append(predictions, predicted)
        prices = np.append(prices, price)

        start_index = start_index + 1
    return predictions, prices

def processCoins(bitcoinPrice, window, horizon):
    output = bitcoinPrice[horizon:][["date", "day", "year", "price"]]
    return output

trainLength = 100
for window in [3,5]:
    for horizon in [1,2,5,7,10]:
        bitcoinPrice = pd.read_csv("..\\prices.csv", sep=",")
        coinOutput = processCoins(bitcoinPrice, window, horizon)
        predictions, prices = arima(bitcoinPrice, coinOutput, window, horizon, trainLength)

This looks like it may be a bug. In the meantime, you can work around it by using a different initialization, like this:

arima = sm.tsa.statespace.SARIMAX(
    endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0),
    initialization='approximate_diffuse')

If you get a chance, please file a bug report.

I don't think this is a bug. I have tested my code with versions 1.8 and 1.7.1 of pmdarima and keep getting the same error on the same series.

arima = sm.tsa.statespace.SARIMAX(
    endog=trainOutput.values, exog=trainFeatures.values, order=(window, 0, 0),
    initialization='approximate_diffuse')