Machine learning: steps per epoch for an LSTM with fit_generator
Because my dataset is large, I am training an LSTM model with fit_generator and a custom generator. I have not used an LSTM with fit_generator before, so I do not know whether my code is correct.
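import os

import numpy as np
import pandas as pd
from keras.layers import LSTM, Activation, Dense, Dropout
from keras.models import Sequential
from keras.utils import to_categorical


def generator_v2(trainDir, nb_classes, batch_size):
    print('start generator')
    classes = ["G11", "G15", "G17", "G19", "G32", "G34", "G48", "G49"]
    while 1:  # loop forever; fit_generator ends each epoch after steps_per_epoch yields
        print('loop generator')
        for root, subdirs, files in os.walk(trainDir):
            for file in files:
                try:
                    # the class label is the name of the folder that holds the file
                    label = os.path.basename(root)
                    label = classes.index(label)
                    label = to_categorical(label, num_classes=nb_classes).reshape(1, nb_classes)
                    df = pd.read_csv(os.path.join(root, file))
                    batches = int(np.ceil(len(df) / batch_size))
                    for i in range(0, batches):
                        # up to batch_size consecutive rows; the last chunk of a file may be shorter
                        x_batch = df[i * batch_size:min(len(df), i * batch_size + batch_size)].values
                        # one sequence per step: (1, timesteps, features)
                        x_batch = x_batch.reshape(1, x_batch.shape[0], x_batch.shape[1])
                        yield x_batch, label
                    del df
                except EOFError:
                    print("error" + file)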

trainDir = "data_diff_level2_statistics"
nb_classes = 8
batch_size = 128
MaxLen = 449  # each csv file has 449 rows
batches = int(np.ceil(MaxLen / batch_size))  # chunks yielded per file
filesCount = sum([len(files) for r, d, files in os.walk(trainDir)])  # the number of all files
steps_per_epoch = batches * filesCount

model = Sequential()
model.add(LSTM(4, input_shape=(None, 5)))  # variable-length sequences with 5 features
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=['acc'])
# epochs is the Keras 2 name for the older nb_epoch argument
model.fit_generator(generator_v2(trainDir, nb_classes, batch_size),
                    steps_per_epoch=steps_per_epoch, epochs=100, verbose=1)

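One way to see what each generator step contains is to run the same slicing on an in-memory array instead of a CSV file. The sketch below is only an illustration: `iter_chunks` and `fake_file` are hypothetical names, and the zero-filled array stands in for one 449-row, 5-column file.

```python
import numpy as np


def iter_chunks(rows, chunk_size):
    """Yield (1, timesteps, features) arrays cut from a 2-D array of rows."""
    n_chunks = int(np.ceil(len(rows) / chunk_size))
    for i in range(n_chunks):
        # Slicing past the end is safe; the last chunk is simply shorter.
        chunk = rows[i * chunk_size:(i + 1) * chunk_size]
        yield chunk.reshape(1, chunk.shape[0], chunk.shape[1])


# Hypothetical stand-in for one CSV file: 449 rows, 5 feature columns.
fake_file = np.zeros((449, 5))
shapes = [chunk.shape for chunk in iter_chunks(fake_file, 128)]
print(shapes)  # [(1, 128, 5), (1, 128, 5), (1, 128, 5), (1, 65, 5)]
```

Under these assumptions each file produces four steps, and every step holds a single sequence, so `batch_size` here acts as the number of timesteps per chunk rather than the number of sequences per batch.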
Have I set steps_per_epoch correctly?
The shape of all my training data is (230, 449, 5).
So I set steps_per_epoch to 230 * (449 / batch_size).
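(449 / batch_size) means that I read 128 rows of a CSV file at a time.

The steps_per_epoch argument should equal the total number of samples (the length of the training set) divided by the batch size (the same applies to validation_steps).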
In your example, the length of the dataset is given by dataset_length = number_of_csv_files * length_of_csv_files.
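The arithmetic can be checked with a short sketch. The figures (230 files, 449 rows, batch size 128) come from the question; the variable names are illustrative. It computes the count two ways: pooling all rows first, and rounding up per file, which is what the question's generator does because it never mixes rows from two files in one chunk.

```python
import math

n_files = 230        # number of CSV files, from the question
rows_per_file = 449  # rows in each CSV file, from the question
batch_size = 128

# General rule: total samples divided by batch size, rounded up.
dataset_length = n_files * rows_per_file
steps_pooled = math.ceil(dataset_length / batch_size)

# Per-file rounding: each file yields ceil(449 / 128) = 4 chunks.
steps_per_file = math.ceil(rows_per_file / batch_size)
steps_per_epoch = n_files * steps_per_file

print(dataset_length, steps_pooled, steps_per_epoch)  # 103270 807 920
```

The per-file figure is the one that matches a generator of this kind, since it equals the number of yields in one full pass over the files.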
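So your code is correct, because you have 230 * (449 / batch_size), which matches what I wrote above.

Thanks! I am taking part in a competition that closes soon, so I hope someone can reply right away. Thanks!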