Python LSTM: too many indices for array

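The purpose of this code is to predict future values of the major currencies on the forex market.

First, I built a unified DataFrame that combines 11 forex market datasets (the most widely traded currencies) with a set of 900 economic indicators.

After merging these 911 datasets into the unified DataFrame, I tested it and found no problems. I then started working with an LSTM neural network, which I also tested on a single dataset only, and it worked very well.

The problems started when I combined the unified DataFrame with the LSTM neural network.

The code is as follows: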

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.layers import Dense, Activation, Dropout
from keras.layers import LSTM
from keras.models import Sequential
import os

os.chdir(r"E:\Business\Stocks")
path = os.listdir(r"E:\Business\Stocks")
for file in path:
    name, ext = os.path.splitext(str(file))
    column_names = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
    df1 = pd.read_csv(file, names=column_names, parse_dates={'DateTime': ['Date', 'Time']}, index_col=[0])
    df1 = df1.rename(columns={'Open': name + ' ' + 'Open', 'High': name + ' ' + 'High',
                              'Low': name + ' ' + 'Low', 'Close': name + ' ' + 'Close',
                              'Volume': name + ' ' + 'Volume'})


os.chdir(r"E:\Business\Economic Indicators")
path = os.listdir(r"E:\Business\Economic Indicators")
for file in path:
    df2 = pd.read_csv(file, index_col=[0], parse_dates=[0])
    name, ext1 = os.path.splitext(file)
    df2 = df2.rename(columns={'Actual': name + ' ' + 'Actual', 'Consensus': name + ' ' + 'Consensus',
                              'Previous': name + ' ' + 'Previous', 'Revised': name + ' ' + 'Revised'})


dfs = [df1 ,df2]
df = pd.concat(dfs, axis=1, join='inner').sort_index(ascending=False)
df.fillna(method='ffill', inplace=True)

sequence_length = 120
n_features = len(df.columns)
val_ratio = 0.1
n_epochs = 3000
batch_size = 500

data = df.values  # DataFrame.as_matrix() was removed in pandas 1.0
data_processed = []
for index in range(len(data) - sequence_length):
    data_processed.append(data[index: index + sequence_length])
data_processed = np.array(data_processed)

val_split = round((1 - val_ratio) * data_processed.shape[0])
train = data_processed[: int(val_split), :]
val = data_processed[int(val_split):, :]

print('Training data: {}'.format(train.shape))
print('Validation data: {}'.format(val.shape))

train_samples, train_nx, train_ny = train.shape
val_samples, val_nx, val_ny = val.shape

train = train.reshape((train_samples, train_nx * train_ny))
val = val.reshape((val_samples, val_nx * val_ny))

preprocessor = MinMaxScaler().fit(train)
train = preprocessor.transform(train)
val = preprocessor.transform(val)

train = train.reshape((train_samples, train_nx, train_ny))
val = val.reshape((val_samples, val_nx, val_ny))

X_train = train[:, : -1]
y_train = train[:, -1][:, -1]
X_val = val[:, : -1]
y_val = val[:, -1][:, -1]

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], n_features))
X_val = np.reshape(X_val, (X_val.shape[0], X_val.shape[1], n_features))

model = Sequential()
model.add(LSTM(input_shape=(X_train.shape[1:]), units=100, return_sequences=True))
model.add(Dropout(0.5))
model.add(LSTM(100, return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(units=1))
model.add(Activation("relu"))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mse', 'accuracy'])

history = model.fit(
    X_train,
    y_train,
    batch_size=batch_size,
    epochs=n_epochs,
    verbose=2)

preds_val = model.predict(X_val)
diff = []
for i in range(len(y_val)):
    pred = preds_val[i][0]
    diff.append(y_val[i] - pred)

real_min = preprocessor.data_min_[104]
real_max = preprocessor.data_max_[104]
print(preprocessor.data_min_[:1])
print(preprocessor.data_max_[:1])

preds_real = preds_val * (real_max - real_min) + real_min
y_val_real = y_val * (real_max - real_min) + real_min

plt.plot(preds_real, label='Predictions')
plt.plot(y_val_real, label='Actual values')
plt.xlabel('test')
plt.legend(loc=0)
plt.show()
print(model.summary())

Here is the error:

Using TensorFlow backend.

Traceback (most recent call last):
  File "E:/Tutorial/new.py", line 47, in <module>
    train = data_processed[: int(val_split), :]
IndexError: too many indices for array

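You are not slicing data_processed correctly. I think there is an extra space next to int(val_split). I am not sure what your intention is, but this should cover the possible cases.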
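
A quick check can help rule the spacing in or out. The sketch below uses a small hypothetical 3-D array (not the asker's data) and compares the slice written with and without the space. Python ignores whitespace inside a subscript expression, so both forms should select the same elements.

```python
import numpy as np

# Hypothetical stand-in for data_processed: 4 windows, 3 time steps, 2 features.
a = np.arange(24).reshape(4, 3, 2)

# Same slice, written with and without a space after the colon.
with_space = a[: int(2), :]
without_space = a[:int(2), :]

# Compare the two results element by element.
same = np.array_equal(with_space, without_space)
print(with_space.shape, same)
```

If the two results match, the spacing is not what triggers the IndexError, and the number of dimensions of the array is the next thing to inspect.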

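Thanks for your answer, but neither the suggested edit nor the linked thread solves this problem.

The error message is quite simple: numpy complains because you are passing too many indices, that is, you are passing three indices when the array is 2D, two indices when the array is 1D, and so on. Can you debug and tell us what data_processed looks like?

It is an empty ndarray...

What shape does it have?

So you have a None instead of an ndarray, and that is your problem.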
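
The diagnosis in the comments can be reproduced with a minimal sketch. It assumes the windowing loop from the question and uses hypothetical zero-filled arrays in place of the real data; make_windows is an illustrative helper, not part of the original code. When the input has no more rows than sequence_length, the loop never runs, np.array([]) has shape (0,), and indexing that 1-D array with two indices raises the IndexError.

```python
import numpy as np

def make_windows(data, sequence_length):
    """Stack overlapping windows of sequence_length rows, like the question's loop."""
    windows = [data[i: i + sequence_length]
               for i in range(len(data) - sequence_length)]
    return np.array(windows)

sequence_length = 120

# Hypothetical input with fewer rows than sequence_length: no windows are built.
too_short = np.zeros((50, 9))
empty = make_windows(too_short, sequence_length)
print(empty.shape, empty.ndim)  # a 1-D array with zero elements

# Two indices on a 1-D array reproduce the error from the question.
raised = False
try:
    empty[: 0, :]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)

# Hypothetical input with enough rows: the result is 3-D and the slice works.
long_enough = np.zeros((300, 9))
full = make_windows(long_enough, sequence_length)
val_split = round(0.9 * full.shape[0])
train = full[: int(val_split), :]
print(full.shape, train.shape)
```

Printing df.shape and data_processed.shape just before the failing line shows which case applies. One possible cause, stated here as an assumption, is that the inner join in pd.concat keeps only timestamps present in both frames, which can leave fewer rows than sequence_length.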