Scikit-learn ValueError: Found input variables with inconsistent numbers of samples [Sklearn and time series]
I'm new to Python and I'm trying to learn how to use Sklearn with my own data. I basically take the residuals from my ARIMA model and the approximation from a discrete wavelet transform, and then try to fit a network on them to get a better model. However, I get the following error message:
File "C:\Users\Guilherme\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 330, in _fit
X, y = self._validate_input(X, y, incremental)
File "C:\Users\Guilherme\Anaconda3\lib\site-packages\sklearn\neural_network\multilayer_perceptron.py", line 902, in _validate_input
multi_output=True)
File "C:\Users\Guilherme\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 531, in check_X_y
check_consistent_length(X, y)
File "C:\Users\Guilherme\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 181, in check_consistent_length
" samples: %r" % [int(l) for l in lengths])
**ValueError: Found input variables with inconsistent numbers of samples: [2, 974]**
My inputs look like this:
residual [-0.62259993 6.71457698 8.14639867 ..., 3.17906001 -7.54454573
4.38012835]
approximation [ 59.91585806 61.33864282 64.3283577 ..., 43.82833793 44.61630255
46.4758499 ]
X [[ 59.91585806 61.33864282 64.3283577 ..., 43.82833793 44.61630255
46.4758499 ]
[ -0.62262456 6.71456765 8.14639445 ..., -4.5462162 -7.28155588
-18.45525218]]
This is my target:
Y [ -0.62274596 6.71452528 8.14634609 ..., -4.54623634 -7.28158029
-18.4552741 ]
Here is my code:
## ANN
print('residual',len(residual))
A = A[4:978]
Residual = residual[0:974]
X = np.array([A,Residual])
print(X.shape)
y = np.array(residual[0:974])
print(y)
clf = MLPClassifier(activation='relu', alpha=1e-05, batch_size='auto',
beta_1=0.9, beta_2=0.999, early_stopping=False,
epsilon=1e-08, hidden_layer_sizes=(15,), learning_rate='constant',
learning_rate_init=0.001, max_iter=200, momentum=0.9,
nesterovs_momentum=True, power_t=0.5, random_state=1, shuffle=True,
solver='lbfgs', tol=0.0001, validation_fraction=0.1, verbose=False,
warm_start=False)
clf.fit(X, y)
print(clf.fit(X, y))
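By the way, these are my references:
Khandelwal, I., Adhikari, R., Verma, G. (2015). Time Series Forecasting Using Hybrid ARIMA and ANN Models Based on DWT Decomposition. Procedia Computer Science 48 (2015) 173–179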
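Neural network models (supervised) (scikit-learn user guide)

I can't understand your X. Please explain more. The error is caused by a mismatch between the number of samples and the number of targets.

According to scikit-learn's MLP fit, X should be X : {array-like, sparse matrix}, shape (n_samples, n_features). So I passed the residuals of the ARIMA model as n_features and the approximation (from the discrete wavelet transform) as n_samples. Both variables have shape (974,). Y has the same shape as well. I tried transposing Y, but nothing changed.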
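The shapes in the error message, [2, 974], suggest that np.array([A, Residual]) stacks the two series as rows, so scikit-learn sees 2 samples with 974 features each, against 974 targets. Transposing y cannot help, because a 1-D array is unchanged by transposition. A minimal sketch of one possible fix follows, assuming each series should be a column (one feature) and each time step a row (one sample). The arrays here are synthetic stand-ins for the real approximation and residual data, and MLPRegressor is swapped in for MLPClassifier on the assumption that the target is continuous, since a classifier expects discrete class labels.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 974

# Synthetic stand-ins (assumption): replace with the real DWT approximation
# and ARIMA residual series, both of shape (974,).
approximation = 50 + 0.1 * rng.normal(size=n).cumsum()
residual = rng.normal(scale=5.0, size=n)

# What the question does: two series stacked as ROWS -> 2 samples, 974 features.
X_wrong = np.array([approximation, residual])
print(X_wrong.shape)

# Stack the series as COLUMNS instead -> 974 samples, 2 features.
X = np.column_stack([approximation, residual])
y = residual  # same target as in the question, shape (974,)
print(X.shape, y.shape)

# Regressor instead of classifier, because y is continuous (assumption).
reg = MLPRegressor(hidden_layer_sizes=(15,), activation='relu',
                   solver='lbfgs', alpha=1e-5, max_iter=200, random_state=1)
reg.fit(X, y)

pred = reg.predict(X)
print(pred.shape)
```

Note that this mirrors the question in using the residual both as a feature and as the target, which makes the fit trivial. For an actual forecast, the features would more plausibly be lagged values of the series, but that depends on how the hybrid model from the referenced paper is set up.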