Python 3.x: Saving and loading a custom TensorFlow model (autoregressive seq2seq multivariate time series GRU/RNN)


python-3.x tensorflow keras deep-learning


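I am trying to implement an autoregressive seq2seq RNN to forecast time series data. The model consists of a custom model class that inherits from `tf.keras.Model`; its code is shown below. I have used this model for time series forecasting with a (15, 108) dataset as input (dimensions: (sequence length, input units)) and a (10, 108) dataset as output.

Although training succeeds, I have not managed to save and reload the model in order to evaluate a previously trained model on a test set. I have looked for solutions online, but none has worked so far. This may be because it is a custom model trained with eager execution; several threads about saving a model under these conditions remained unresolved.

Can anyone give me some advice on how to solve this? Any help is much appreciated, thanks!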

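So far, I have loaded the model with `tf.keras.models.load_model(filepath)` and tried the following two saving options. The code for both is shown below.

Option 1: saving with the `keras.callbacks.ModelCheckpoint` callback. However, this only produced a .ckpt.data-00000-of-00001 file and a .ckpt.index file (so no .meta or .pb file), which I could not open.

Option 2: saving with `tf.saved_model.save` and then loading the model, which resulted in an error.

This is the code used to build the model: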


    import tensorflow as tf
    from tensorflow import keras

    # neurons, output_len, output_activation, loss, lr, checkpoint_path, timestring
    # and the train/test arrays are assumed to be defined earlier in the notebook.
    # FeedBack is the custom class shown further down.
    model = FeedBack(units=neurons, out_steps=output_len, num_features=108,
                     act_dense=output_activation)

    # `learning_rate` replaces the deprecated `lr` argument.
    model.compile(loss=loss, optimizer=tf.optimizers.Adam(learning_rate=lr),
                  metrics=['mean_absolute_error', 'mean_absolute_percentage_error',
                           keras.metrics.RootMeanSquaredError()])

    cp_callback = keras.callbacks.ModelCheckpoint(filepath=checkpoint_path, save_best_only=True, verbose=0)
    earlyStopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=6, verbose=0,
                                                  min_delta=1e-9, mode='auto')

    # OPTION 1: USE ModelCheckpoint
    r = model.fit(x=train_x, y=train_y, batch_size=32, shuffle=False, epochs=1,
                  validation_data=(test_x, test_y),
                  callbacks=[earlyStopping, cp_callback], verbose=0)

    # OPTION 2: USE tf.saved_model.save()
    !mkdir -p saved_model
    model.save('/content/drive/My Drive/Colab Notebooks/Master thesis/NN_data/saved_model/s-%s' % timestring)
    tf.saved_model.save(model, '/content/drive/My Drive/Colab Notebooks/Master thesis/NN_data/saved_model/s-%s' % timestring)


    class FeedBack(tf.keras.Model):
        def __init__(self, units, out_steps, num_features, act_dense):
            super().__init__()
            self.out_steps = out_steps
            self.units = units
            self.num_features = num_features
            self.act_dense = act_dense
            self.gru_cell = tf.keras.layers.GRUCell(units)
            # Also wrap the GRUCell in an RNN to simplify the `warmup` method.
            self.gru_rnn = tf.keras.layers.RNN(self.gru_cell, return_state=True)
            self.dense = tf.keras.layers.Dense(num_features, activation=act_dense)

        def warmup(self, inputs):
            # inputs.shape => (batch, time, features)
            # x.shape => (batch, gru_units)
            x, state = self.gru_rnn(inputs)

            # prediction.shape => (batch, features)
            prediction = self.dense(x)
            return prediction, state

        def call(self, inputs, training=None):
            # Use a Python list to collect the unrolled outputs.
            predictions = []
            # Initialize the GRU state from the input sequence.
            prediction, state = self.warmup(inputs)

            # Insert the first prediction.
            predictions.append(prediction)

            # Run the rest of the prediction steps.
            for _ in range(1, self.out_steps):
                # Use the last prediction as input.
                x = prediction
                # Execute one GRU step.
                x, state = self.gru_cell(x, states=state, training=training)
                # Convert the GRU output to a prediction.
                prediction = self.dense(x)
                # Add the prediction to the output.
                predictions.append(prediction)

            # predictions.shape => (time, batch, features)
            predictions = tf.stack(predictions)
            # predictions.shape => (batch, time, features)
            predictions = tf.transpose(predictions, [1, 0, 2])
            return predictions

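I would say the problem is the file path you pass to the ModelCheckpoint callback: it should be an hdf5 file.

For example, in my case:

    ckpt_name = '/work/../weights/{}.hdf5'.format(log_name)
    ...
    callbacks = [
        tensorboard(...),
        tf.keras.callbacks.ModelCheckpoint(filepath=ckpt_name)
    ]
    ...
    model.fit(train_generator, validation_data=validation_generator, validation_freq=1, epochs=flags['epochs'],
              callbacks=callbacks)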
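A caveat on the hdf5 suggestion: as far as I know, Keras cannot write a complete subclassed model such as `FeedBack` to HDF5, so an `.hdf5` path is likely to work only for the weights. Below is a minimal sketch of a weights-only round trip. `TinyFeedBack`, the layer sizes, the random batch and the temporary file name are all assumptions made for illustration and are not taken from the question; the `.weights.h5` suffix is chosen because newer Keras versions require it.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


class TinyFeedBack(tf.keras.Model):
    """Hypothetical, simplified stand-in for the FeedBack model."""

    def __init__(self, units, out_steps, num_features):
        super().__init__()
        self.out_steps = out_steps
        self.gru_rnn = tf.keras.layers.GRU(units, return_state=True)
        self.gru_cell = tf.keras.layers.GRUCell(units)
        self.dense = tf.keras.layers.Dense(num_features)

    def call(self, inputs, training=None):
        # Warm up on the input sequence, then feed predictions back in.
        x, state = self.gru_rnn(inputs, training=training)
        prediction = self.dense(x)
        predictions = [prediction]
        for _ in range(1, self.out_steps):
            # The cell state is passed and returned as a one-element list.
            x, [state] = self.gru_cell(prediction, [state], training=training)
            prediction = self.dense(x)
            predictions.append(prediction)
        # Stack along the time axis => (batch, time, features)
        return tf.stack(predictions, axis=1)


# Assumed shapes: 15 input steps, 10 output steps, 108 features.
rng = np.random.default_rng(0)
batch = rng.normal(size=(2, 15, 108)).astype("float32")

model = TinyFeedBack(units=8, out_steps=10, num_features=108)
before = np.asarray(model(batch))  # the first call creates the variables

weights_path = os.path.join(tempfile.mkdtemp(), "feedback.weights.h5")
model.save_weights(weights_path)

# Re-create the architecture in code, build it, then restore the weights.
restored = TinyFeedBack(units=8, out_steps=10, num_features=108)
restored(batch)
restored.load_weights(weights_path)
after = np.asarray(restored(batch))

print(np.allclose(before, after, atol=1e-6))
```

With a callback, the analogous setting would be `ModelCheckpoint(filepath=..., save_weights_only=True)`. The price is that the class definition has to be available when reloading, because only the weights are stored.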

On second thought, the root of the problem is that in `__init__` you wrap `gru_cell` in a `layers.RNN`. As a result, the same `gru_cell` is used twice: once in `warmup()` and then again in `call()`. This is not a problem for training, but as you noticed, it fails when saving the model.

Replace the custom RNN layer with `layers.GRU`.

Change this:

    def __init__(self, units, out_steps, num_features, act_dense):
        ...
        self.gru_cell = tf.keras.layers.GRUCell(units)
        # Also wrap the GRUCell in an RNN to simplify the `warmup` method.
        self.gru_rnn = tf.keras.layers.RNN(self.gru_cell, return_state=True)
        ...

To this:

    def __init__(self, units, out_steps, num_features, act_dense):
        ...
        self.gru_cell = tf.keras.layers.GRUCell(units)
        self.gru_rnn = tf.keras.layers.GRU(units, return_state=True)
        ...

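(Edit)

Note: with this change, the `gru_cell` and `gru_rnn` layers from the original code no longer share their weights. In that sense the original version is preferable, because the same GRUCell operates on the entire sequence.

In my version, `layers.GRU` processes the input sequence, after which the state is passed to `layers.GRUCell`. The drawback is that the weights of `layers.GRUCell` have to be optimized (learned) separately and cannot benefit from using the same weights as `layers.GRU`, and vice versa.

Hi, thanks for your suggestion, it did indeed solve my problem. However, since the GRU cell and the GRU layer are now defined separately, the number of parameters to train has doubled. Are the weights of gru_cell and gru_rnn the same, or are they trained independently? Nothing changed in terms of performance, but I am curious what happens under the hood. Thanks!

You are right, that is something I had not thought of. The GRU cell and the GRU layer do not share their weights. Their weights are optimized separately, which may lead to worse predictions (larger errors), because the GRU cell does not benefit from the weights the GRU layer has already "learned" and has to "relearn" its own, and vice versa.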
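To put a number on the doubling mentioned in the comments, here is a rough sketch that counts the recurrent parameters by hand. It assumes the TF 2 default `reset_after=True` (separate input and recurrent biases), and `units = 32` is a made-up value, since the question does not state what `neurons` was; `gru_param_count` is a helper written for this example, not a Keras function.

```python
def gru_param_count(input_dim, units, reset_after=True):
    """Trainable parameters of one Keras-style GRU cell.

    Each of the three gates has an input kernel, a recurrent kernel and
    a bias; with reset_after=True there are two bias vectors per gate.
    """
    bias_sets = 2 if reset_after else 1
    return 3 * (input_dim * units + units * units + bias_sets * units)


num_features = 108  # from the question
units = 32          # assumed value for illustration

# Original model: the RNN wrapper reuses the single GRUCell, one weight set.
shared = gru_param_count(num_features, units)

# Revised model: the GRU layer and the GRUCell each own a weight set.
# Both see 108-dimensional inputs, because predictions are fed back in.
separate = 2 * gru_param_count(num_features, units)

print(shared, separate)
```

The Dense output layer is the same in both versions, so only the recurrent part doubles.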
