Python 3.x: Saving and loading a custom TensorFlow model (autoregressive seq2seq multivariate time series GRU/RNN)
I am trying to implement an autoregressive seq2seq RNN to forecast time series data. The model consists of a custom model class that inherits from tf.keras.Model; its code is shown below. I have used this model for time series forecasting, with a (15, 108) dataset as input (dimensions: (sequence length, input units)) and a (10, 108) dataset as output.
Although training succeeds, I have not managed to save and reload the model so that I can evaluate a previously trained model on a test set. I have looked for solutions online, but none has worked so far. This may be because it is a custom model trained with eager execution, as several threads about saving models under these conditions remain unresolved.
Could anyone give me some advice on how to solve this? Any help is much appreciated, thanks.
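Thus far, I have loaded the model with tf.keras.models.load_model(filepath) and tried the following saving options. The code for both options is shown below:
- Saving with the keras.callbacks.ModelCheckpoint callback. However, only a .ckpt.data-00000-of-00001 file and a .ckpt.index file were produced (so no .meta or .pb file), and I could not open them.
- Saving with the tf.saved_model.save function and then loading the model, which resulted in an error.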
import tensorflow as tf
from tensorflow import keras

model = FeedBack(units=neurons, out_steps=output_len, num_features=108, act_dense=output_activation)
# `learning_rate` replaces the deprecated `lr` argument.
model.compile(loss=loss, optimizer=tf.optimizers.Adam(learning_rate=lr),
              metrics=['mean_absolute_error', 'mean_absolute_percentage_error', keras.metrics.RootMeanSquaredError()])
cp_callback = keras.callbacks.ModelCheckpoint(filepath=checkpoint_path, save_best_only=True, verbose=0)
earlyStopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=6, verbose=0, min_delta=1e-9, mode='auto')
# OPTION 1: USE ModelCheckpoint
r = model.fit(x=train_x, y=train_y, batch_size=32, shuffle=False, epochs=1,
              validation_data=(test_x, test_y), callbacks=[earlyStopping, cp_callback], verbose=0)
# OPTION 2: USE tf.saved_model.save()
!mkdir -p saved_model
model.save('/content/drive/My Drive/Colab Notebooks/Master thesis/NN_data/saved_model/s-%s' % timestring)
tf.saved_model.save(model, '/content/drive/My Drive/Colab Notebooks/Master thesis/NN_data/saved_model/s-%s' % timestring)
class FeedBack(tf.keras.Model):
    def __init__(self, units, out_steps, num_features, act_dense):
        super().__init__()
        self.out_steps = out_steps
        self.units = units
        self.num_features = num_features
        self.act_dense = act_dense
        self.gru_cell = tf.keras.layers.GRUCell(units)
        # Also wrap the GRUCell in an RNN to simplify the `warmup` method.
        self.gru_rnn = tf.keras.layers.RNN(self.gru_cell, return_state=True)
        self.dense = tf.keras.layers.Dense(num_features, activation=act_dense)  # self.num_features?

    def warmup(self, inputs):
        # inputs.shape => (batch, time, features)
        # x.shape => (batch, gru_units)
        x, state = self.gru_rnn(inputs)
        # prediction.shape => (batch, features)
        prediction = self.dense(x)
        return prediction, state

    def call(self, inputs, training=None):
        # Use a Python list to collect the dynamically unrolled outputs.
        predictions = []
        # Initialize the GRU state
        prediction, state = self.warmup(inputs)
        # Insert the first prediction
        predictions.append(prediction)
        # Run the rest of the prediction steps
        for _ in range(1, self.out_steps):
            # Use the last prediction as input.
            x = prediction
            # Execute one GRU step.
            x, state = self.gru_cell(x, states=state,
                                     training=training)
            # Convert the GRU output to a prediction.
            prediction = self.dense(x)
            # Add the prediction to the output
            predictions.append(prediction)
        # predictions.shape => (time, batch, features)
        predictions = tf.stack(predictions)
        # predictions.shape => (batch, time, features)
        predictions = tf.transpose(predictions, [1, 0, 2])
        return predictions
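I would say the problem is the file path you pass to the ModelCheckpoint callback: it should be an HDF5 file. For example, in my case:
ckpt_name = '/work/../weights/{}.hdf5'.format(log_name)
...
callbacks = [
    TensorBoard(...),
    tf.keras.callbacks.ModelCheckpoint(filepath=ckpt_name)
]
...
model.fit(train_generator, validation_data=val_generator, validation_freq=1, epochs=FLAGS['epochs'],
          callbacks=callbacks)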
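For a subclassed model such as the one in the question, an HDF5 file generally holds the weights only, so reloading means rebuilding the model in code and then restoring the weights into it. The following is a minimal sketch of that weights-only round trip, not the asker's or the answerer's actual code. `TinyFeedBack`, the layer sizes, the random input, and the temporary file path are all hypothetical; the `.weights.h5` suffix is an assumption chosen because recent Keras versions require it for `save_weights`.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


class TinyFeedBack(tf.keras.Model):
    """Hypothetical, shrunken stand-in for the FeedBack model in the question."""

    def __init__(self, units, out_steps, num_features):
        super().__init__()
        self.out_steps = out_steps
        self.gru_cell = tf.keras.layers.GRUCell(units)
        self.gru_rnn = tf.keras.layers.RNN(self.gru_cell, return_state=True)
        self.dense = tf.keras.layers.Dense(num_features)

    def call(self, inputs, training=None):
        # Warm up on the input sequence, then feed predictions back in.
        x, state = self.gru_rnn(inputs, training=training)
        prediction = self.dense(x)
        predictions = [prediction]
        for _ in range(1, self.out_steps):
            x, state = self.gru_cell(prediction, states=state, training=training)
            prediction = self.dense(x)
            predictions.append(prediction)
        # (time, batch, features) -> (batch, time, features)
        return tf.transpose(tf.stack(predictions), [1, 0, 2])


def build_model():
    # Same constructor arguments every time, so the variables line up.
    model = TinyFeedBack(units=8, out_steps=10, num_features=108)
    # A subclassed model only creates its variables on the first call.
    model(tf.zeros((1, 15, 108)))
    return model


# Random stand-in data with the shapes from the question: (batch, 15, 108).
x = np.random.rand(4, 15, 108).astype("float32")

trained = build_model()  # in practice this model would be fitted first
weights_path = os.path.join(tempfile.mkdtemp(), "feedback.weights.h5")
trained.save_weights(weights_path)

restored = build_model()  # fresh instance with different random weights
restored.load_weights(weights_path)

max_diff = float(np.max(np.abs(np.asarray(trained(x)) - np.asarray(restored(x)))))
print("max abs difference between original and restored predictions:", max_diff)
```

The restored model is usable for evaluation on a test set, but the optimizer state is not part of this round trip, so it is a fallback for inference rather than a way to resume training.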
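Come to think of it, the root of the problem is that in __init__ you wrap gru_cell in a layers.RNN. This causes the same gru_cell to be used twice: once in warmup() and then again in call(). That is not a problem for training but, as you noticed, it fails when saving the model.
Replace the custom RNN layer with layers.GRU.
Change this: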
def __init__(self, units, out_steps, num_features, act_dense):
    ...
    self.gru_cell = tf.keras.layers.GRUCell(units)
    # Also wrap the GRUCell in an RNN to simplify the `warmup` method.
    self.gru_rnn = tf.keras.layers.RNN(self.gru_cell, return_state=True)
    ...
To this:
def __init__(self, units, out_steps, num_features, act_dense):
    ...
    self.gru_cell = tf.keras.layers.GRUCell(units)
    self.gru_rnn = tf.keras.layers.GRU(units, return_state=True)
    ...
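(Edit)
Note: unlike in the original code, the gru_cell and gru_rnn layers will not share their weights. In that sense the original version is preferable, because the same GRUCell operates over the entire sequence.
In my version, layers.GRU operates on the input sequence, after which the state is passed to layers.GRUCell. The drawback is that the weights of layers.GRUCell have to be optimized (learned) separately and cannot benefit from using the same weights as layers.GRU, and vice versa.
Comment: Hi, thanks for your suggestion, it did indeed solve my problem. However, since the GRU cell and the GRU layer are now defined separately, the number of parameters to train has also doubled. Are the weights of gru_cell and gru_rnn the same, or are they trained independently? Performance-wise nothing has changed, but I am curious what happens under the hood. Thanks!
Reply: You are right, I had not thought of that. The GRU cell and the GRU layer do not share their weights. Their weights are optimized separately, which may lead to worse predictions (larger errors), because the GRU cell does not benefit from the weights the GRU layer has "learned" and instead has to "relearn" its own weights, and vice versa.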
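To make the parameter doubling concrete, here is a rough back-of-the-envelope count in plain Python. It assumes the TensorFlow 2 default of reset_after=True for both GRUCell and GRU (two bias vectors per gate) and uses a hypothetical units=64, since the value of `neurons` is not given in the question; num_features=108 comes from the question.

```python
def gru_param_count(input_dim, units, reset_after=True):
    """Trainable parameters of one GRU cell (update, reset and candidate gates)."""
    kernel = input_dim * 3 * units            # input -> gates
    recurrent_kernel = units * 3 * units      # previous state -> gates
    # Assumption: reset_after=True keeps separate input and recurrent biases.
    bias = (2 if reset_after else 1) * 3 * units
    return kernel + recurrent_kernel + bias


def dense_param_count(input_dim, output_dim):
    """Weights plus biases of a fully connected layer."""
    return input_dim * output_dim + output_dim


NUM_FEATURES = 108  # from the question
UNITS = 64          # hypothetical value for `neurons`

gru = gru_param_count(NUM_FEATURES, UNITS)
dense = dense_param_count(UNITS, NUM_FEATURES)

shared_total = gru + dense        # original: RNN(gru_cell) reuses one GRUCell
separate_total = 2 * gru + dense  # revised: layers.GRU plus a separate GRUCell

print("one GRU cell:", gru)                        # 33408
print("original (shared cell):", shared_total)     # 40428
print("revised (separate layers):", separate_total)  # 73836
```

Only the recurrent part is duplicated; the Dense output layer is still shared, so the total grows by exactly one GRU cell's worth of parameters.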