"saved_model_cli show" on a Keras + Transformers model shows different inputs and shapes than the ones used for training

Tags: keras, google-cloud-platform, tensorflow2.0, huggingface-transformers

I am building my model with transformers (TFBertForSequenceClassification.from_pretrained with 'bert-base-multilingual-uncased') and Keras:

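import tensorflow as tf
from transformers import TFBertForSequenceClassification

# learning_rate, epsilon, pretrained_weights, num_labels and
# pretrained_model_dir are assumed to be defined earlier in the script

# loss
loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)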

# metric
metric = tf.keras.metrics.SparseCategoricalAccuracy('accuracy')

# optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, epsilon=epsilon)

# create and compile the Keras model in the context of strategy.scope
model = TFBertForSequenceClassification.from_pretrained(pretrained_weights,
                                                        num_labels=num_labels,
                                                        cache_dir=pretrained_model_dir)
model._name = 'tf_bert_classification'

# compile Keras model
model.compile(optimizer=optimizer,
              loss=loss,
              metrics=[metric])

I am using the SST2 data, which is tokenized and then fed to the model for training. The data has the following shape:

    shape: (32,)
    dict structure
       dim: 3
       [input_ids       / attention_mask  / token_type_ids ]
       [(32, 128)       / (32, 128)       / (32, 128)      ]
       [ndarray         / ndarray         / ndarray        ]

Here is an example:

({'input_ids': <tf.Tensor: shape=(32, 128), dtype=int32, numpy=
array([[  101, 21270, 94696, ...,     0,     0,     0],
       [  101,   143, 45100, ...,     0,     0,     0],
       [  101, 24220,   102, ...,     0,     0,     0],
       ...,
       [  101, 11008, 10346, ...,     0,     0,     0],
       [  101, 43062, 15648, ...,     0,     0,     0],
       [  101, 13178, 18418, ...,     0,     0,     0]], dtype=int32)>, 'attention_mask': ....

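As the saved_model_cli output shows, the saved model only expects input_ids, not attention_mask and token_type_ids, and the shape is different. The batch size is undefined (-1), as expected, but the maximum length is 5 instead of 128! This worked two months ago, so I may have introduced something that causes the issue.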

I tried several versions of TensorFlow (2.2.0 and 2.3.0) and transformers (2.8.0, 2.9.0 and 3.0.2). I also cannot see the Keras model's input and output shapes (None):

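Do you have any idea what could explain why the saved model expects a different input than the one used for training? I could use the Keras functional API and define the input shapes myself, but I am pretty sure this code used to work.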

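I have seen this behavior when a model is instantiated from a pretrained model, then has weights loaded into it, and only then is saved in the full Keras format.

When I subsequently load the saved model, it fails to produce correct predictions because its signature has turned into garbage: the attention mask disappears, the sequence length changes, and dummy None inputs appear out of nowhere. So, if that is your case, try saving your model in the Keras format right after fitting, without an intermediate load from weights.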
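
If saving right after fitting is not an option, another thing that may be worth trying is pinning the serving signature explicitly, so that the export does not depend on whatever shapes happened to be traced. The sketch below is not the code from the question: StandInClassifier, VOCAB_SIZE and the toy computation are hypothetical placeholders for the fine-tuned BERT model, and only the three input names and the (None, 128) shapes come from the question. It saves with tf.saved_model.save and then reloads the model to check what the serving_default signature expects.

```python
import tempfile

import tensorflow as tf

MAX_LENGTH = 128   # sequence length of the training batches in the question
NUM_LABELS = 2     # matches the (-1, 2) output shape reported by the CLI
VOCAB_SIZE = 1000  # hypothetical; only used by the stand-in embedding


class StandInClassifier(tf.Module):
    """Hypothetical stand-in for the fine-tuned BERT classifier."""

    def __init__(self):
        super().__init__()
        self.embedding = tf.Variable(tf.random.normal([VOCAB_SIZE, 8]))
        self.type_embedding = tf.Variable(tf.random.normal([2, 8]))
        self.kernel = tf.Variable(tf.random.normal([8, NUM_LABELS]))

    # The input_signature fixes the names, dtypes and shapes of the inputs.
    @tf.function(input_signature=[
        tf.TensorSpec([None, MAX_LENGTH], tf.int32, name="input_ids"),
        tf.TensorSpec([None, MAX_LENGTH], tf.int32, name="attention_mask"),
        tf.TensorSpec([None, MAX_LENGTH], tf.int32, name="token_type_ids"),
    ])
    def serve(self, input_ids, attention_mask, token_type_ids):
        # Toy computation that uses all three inputs: a masked mean of embeddings.
        ids = tf.math.floormod(input_ids, VOCAB_SIZE)
        hidden = tf.gather(self.embedding, ids)
        hidden += tf.gather(self.type_embedding,
                            tf.clip_by_value(token_type_ids, 0, 1))
        mask = tf.cast(attention_mask, tf.float32)[..., tf.newaxis]
        pooled = tf.reduce_sum(hidden * mask, axis=1) / tf.maximum(
            tf.reduce_sum(mask, axis=1), 1.0)
        return {"logits": tf.matmul(pooled, self.kernel)}


module = StandInClassifier()
export_dir = tempfile.mkdtemp()
tf.saved_model.save(
    module, export_dir,
    signatures={"serving_default": module.serve.get_concrete_function()})

# Keep a reference to the loaded object so its variables stay alive.
loaded = tf.saved_model.load(export_dir)
serving_fn = loaded.signatures["serving_default"]

# Same information that saved_model_cli prints for the inputs.
_, input_specs = serving_fn.structured_input_signature
for name, spec in sorted(input_specs.items()):
    print(name, spec.shape, spec.dtype.name)
```

With a real transformers model you would wrap the model call in a tf.function with the same three TensorSpec entries and pass it as the signature when saving; I have not verified that against the transformers versions listed in the question, so treat it as a starting point and confirm the result with saved_model_cli.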

saved_model_cli show --dir $MODEL_LOCAL --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['input_ids'] tensor_info:
      dtype: DT_INT32
      shape: (-1, 5)
      name: serving_default_input_ids:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['output_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 2)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

model.inputs


model.outputs


model.summary()

Model: "tf_bert_classification"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bert (TFBertMainLayer)       multiple                  167356416 
_________________________________________________________________
dropout_37 (Dropout)         multiple                  0         
_________________________________________________________________
classifier (Dense)           multiple                  1538      
=================================================================
Total params: 167,357,954
Trainable params: 167,357,954
Non-trainable params: 0