TensorFlow: saving the Keras model after each iteration fails
Instead of using sklearn's RandomizedSearchCV (where I have to wait for the whole search to finish before seeing the best result), I am trying to create all the hyperparameter combinations in advance and run them one by one, so that I can stop at any time and resume whenever the server has nothing else to do.

Here is my code. First, the function that creates the model:
    from keras.models import Sequential
    from keras.layers import Dense, Dropout
    from keras.wrappers.scikit_learn import KerasRegressor
    from keras import backend as K

    def create_model(neurons=2000, activation1='tanh', dropout_rate=0.0, activation2='sigmoid'):
        model = Sequential()
        model.add(Dense(neurons, input_dim=10000, activation=activation1))
        model.add(Dropout(dropout_rate))
        # model.add(Dense(390, activation='relu'))
        model.add(Dense(61, activation=activation2))
        # Compile model
        model.compile(loss='mean_squared_error', optimizer='adam')
        return model

    model = KerasRegressor(build_fn=create_model, verbose=4)
Second, I create all the hyperparameter combinations; they are stored in the params variable:
    import numpy as np
    from sklearn.model_selection import RandomizedSearchCV

    batch_size = np.arange(1, 400)
    nb_epoch = np.arange(100, 400)
    activation1 = ['relu', 'tanh', 'sigmoid']
    activation2 = ['sigmoid']
    dropout_rate = np.arange(0, 0.2, 0.01)
    neurons = np.arange(250, 5000)

    param_distributions = dict(batch_size=batch_size, nb_epoch=nb_epoch, activation1=activation1,
                               activation2=activation2, dropout_rate=dropout_rate, neurons=neurons)

    grid = RandomizedSearchCV(estimator=model, param_distributions=param_distributions, n_iter=1000, n_jobs=-1,
                              random_state=42)
    x = grid._get_param_iterator()
    params = list(x)
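For reference, the same list of sampled configurations can be produced without reaching into RandomizedSearchCV's private _get_param_iterator, by using sklearn's public ParameterSampler directly (a sketch; the distributions mirror the ones above):

```python
import numpy as np
from sklearn.model_selection import ParameterSampler

# Same search space as in the question.
param_distributions = dict(
    batch_size=np.arange(1, 400),
    nb_epoch=np.arange(100, 400),
    activation1=['relu', 'tanh', 'sigmoid'],
    activation2=['sigmoid'],
    dropout_rate=np.arange(0, 0.2, 0.01),
    neurons=np.arange(250, 5000),
)

# ParameterSampler yields n_iter randomly sampled configuration dicts,
# reproducibly when random_state is fixed.
params = list(ParameterSampler(param_distributions, n_iter=1000, random_state=42))
print(len(params))  # 1000
```

ParameterSampler is a public, stable API, so this keeps working across sklearn versions where the private method might change.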
Finally, I run the model, each time with the next hyperparameter configuration from params:
    for p in params:
        model = KerasRegressor(build_fn=create_model, verbose=4)
        batch_size = [p['batch_size']]
        nb_epoch = [p['nb_epoch']]
        activation1 = [p['activation1']]
        activation2 = [p['activation2']]
        dropout_rate = [p['dropout_rate']]
        neurons = [p['neurons']]
        curr_param_distributions = dict(batch_size=batch_size, nb_epoch=nb_epoch, activation1=activation1,
                                        activation2=activation2, dropout_rate=dropout_rate, neurons=neurons)
        curr_grid = RandomizedSearchCV(estimator=model, param_distributions=curr_param_distributions, n_iter=1, n_jobs=-1)
        curr_grid_result = curr_grid.fit(X, y)
        curr_score = curr_grid.best_score_
        curr_grid.best_estimator_.model.save(
            '/data/models/model_{}_score_{}.h5'.format(str(curr_param_distributions), curr_score))
        del model
        del curr_grid
        del curr_grid_result
        K.clear_session()
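Since the whole point of running the configurations one by one is being able to stop and resume, one way to make the loop restartable is to derive the saved filename from the parameters alone and skip any configuration that already has a model on disk. A sketch of that idea (param_tag and already_done are hypothetical helpers, not part of the code above; '/data/models' is the directory from the question):

```python
import glob
import os

def param_tag(p):
    # Deterministic, filesystem-friendly tag built from the hyperparameter dict.
    return '_'.join('{}-{}'.format(k, p[k]) for k in sorted(p))

def already_done(p, out_dir='/data/models'):
    # True if a model for this configuration was already saved; the score
    # suffix in the filename is matched with a glob wildcard since the
    # score is unknown before fitting.
    pattern = os.path.join(out_dir, 'model_{}_score_*.h5'.format(param_tag(p)))
    return bool(glob.glob(pattern))
```

In the loop this would be used as `if already_done(p): continue` before fitting, with the model saved to `'model_{}_score_{}.h5'.format(param_tag(p), curr_score)`, so a restarted run picks up where the previous one left off.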
The first iteration runs without any problem. The second iteration gets stuck after the first epoch (i.e., it keeps printing 1/257 1/257 1/257, taking nb_epoch=257 as an example) and then it just stops.

Why does this happen? Any help would be appreciated.