Expected ndim=3, found ndim=4 when using K.function() in the Keras backend to get an intermediate layer of a model


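I am trying to extract the last layer of a classification model trained on some data. The first layer is an Embedding layer, followed by a BiLSTM, then an output Dense layer. Below is my code. I keep getting a 4D shape (1, 38, 300, 300) instead of a 3D shape (1, 38, 300). Here 1 is the number of samples, 38 is the maximum sentence length, and 300 is the word2vec vector length.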

from keras import backend as K
from tensorflow.keras.models import load_model
import numpy as np
import gensim
word2vec = 'GoogleNews-vectors-negative300.txt'


x_matrix = np.zeros((1, 38, 300))
sentene_label = 'the weather today was extremely unpredictable,0'
parts = sentene_label.split(',')
label = int(parts[1])  
sentence = parts[0] 

words = sentence.split(' ')
words = words[:x_matrix.shape[1]]  
for j, word in enumerate(words):
    if word in word2vec:
        # x_matrix[0, j, :] = word2vec[word]
        x_matrix[0, j, :] = loaded_model.word_vec(word)


model = load_model('TrainedModel.h5')
get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[2].output])  
layer_output = get_3rd_layer_output(x_matrix)[0]
print("Layer Output Shape 1 : ", layer_output.shape)

I have gone over my code several times, but I cannot see why the dimensions are wrong.

Here is the traceback:

Traceback (most recent call last):
  File "/usr/pkg/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3427, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-bb840b495480>", line 1, in <module>
    runfile('/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation/test.py', wdir='/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation')
  File "/am/embassy/vol/x6/jetbrains/apps/PyCharm-P/ch-0/201.7846.77/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/am/embassy/vol/x6/jetbrains/apps/PyCharm-P/ch-0/201.7846.77/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation/test.py", line 451, in <module>
    layer_output = get_3rd_layer_output(x_matrix)[0]
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/backend.py", line 4073, in func
    outs = model(model_inputs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1012, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py", line 424, in call
    return self._run_internal_graph(
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py", line 560, in _run_internal_graph
    outputs = node.layer(*args, **kwargs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/layers/wrappers.py", line 539, in __call__
    return super(Bidirectional, self).__call__(inputs, **kwargs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 998, in __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/input_spec.py", line 219, in assert_input_compatibility
    raise ValueError('Input ' + str(input_index) + ' of layer ' +
ValueError: Input 0 of layer bidirectional_9 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (1, 38, 300, 300)

The shape of x_matrix before calling get_3rd_layer_output is:

The shape of X matrix : (60, 38, 300)

Architecture of the trained model:

model = Sequential()
model.add(Embedding(vocab_size, 300, input_length=38, weights=[embedding_matrix], trainable=True))
model.add(Bidirectional(LSTM(100, dropout=0.2)))
model.add(Dense(3, activation='sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='Adagrad', metrics=['accuracy'])
model.summary()

es = EarlyStopping(monitor='val_loss', mode='min', baseline=0.3, patience=100, verbose=1)
mc = ModelCheckpoint('TrainedModel.h5', monitor='val_loss', mode='min', verbose=1, save_best_only=True)
hist = model.fit(train_sequences, train_y, epochs=200, verbose=False, batch_size=100,validation_data=(val_sequences, val_y),callbacks=[es, mc])

The model.summary() output of the trained model is:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_9 (Embedding)      (None, 38, 300)           7370400   
_________________________________________________________________
bidirectional_9 (Bidirection (None, 200)               320800    
_________________________________________________________________
dense_9 (Dense)              (None, 3)                 603       
=================================================================
Total params: 7,691,803
Trainable params: 7,691,803
Non-trainable params: 0
_________________________________________________________________
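The parameter counts in this summary can be checked by hand, which also shows what each layer expects. The sketch below is only arithmetic on the numbers printed above. The vocabulary size is an inference (7370400 / 300), not something stated in the question, and the LSTM formula assumes the standard four-gate layout with one bias vector per gate.

```python
emb_size = 300      # embedding / word2vec dimension (from the question)
lstm_units = 100    # LSTM(100) in the model definition
n_classes = 3       # Dense(3)

# Embedding: one 300-d row per vocabulary entry, looked up by integer id.
embedding_params = 7_370_400                 # taken from the summary
vocab_size = embedding_params // emb_size    # inferred, not stated in the question

# LSTM: 4 gates, each with input weights, recurrent weights and a bias.
lstm_params_one_direction = 4 * (lstm_units * (emb_size + lstm_units) + lstm_units)
bilstm_params = 2 * lstm_params_one_direction  # forward + backward copies

# Bidirectional concatenates both directions, hence the (None, 200) output.
bilstm_output_dim = 2 * lstm_units
dense_params = bilstm_output_dim * n_classes + n_classes

total_params = embedding_params + bilstm_params + dense_params
print(vocab_size, bilstm_params, dense_params, total_params)
```

The totals match the summary, and the inferred vocabulary size is a reminder that the first layer is indexed by integer word ids, one per token position.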

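The correct way to get the output of any intermediate layer is to create a sub-model that takes the same input as the model you trained. In your case the error occurs because you are passing a 3D matrix of word embeddings to the trained model, whereas you must pass the same kind of data that was used for training (2D data with integer-encoded words). The Embedding layer appends an axis of size 300 to whatever it receives, so a (1, 38, 300) input reaches the BiLSTM as (1, 38, 300, 300).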
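The shape behaviour can be reproduced without Keras. The sketch below uses NumPy indexing as a stand-in for the embedding lookup, so it is an illustration of the mechanism rather than the Keras implementation; `embedding_lookup` is a hypothetical helper and the vocabulary size of 111 is made up.

```python
import numpy as np

vocab_size, emb_size, max_len = 111, 300, 38  # vocab_size is illustrative only

rng = np.random.default_rng(0)
embedding_matrix = rng.uniform(-1, 1, (vocab_size, emb_size))


def embedding_lookup(ids):
    """Stand-in for an embedding layer: replace every integer id by its row."""
    return embedding_matrix[np.asarray(ids, dtype=np.int64)]


# Correct input: integer-encoded sentences, shape (batch, max_len).
ids_2d = rng.integers(0, vocab_size, (1, max_len))
out_ok = embedding_lookup(ids_2d)

# Wrong input: vectors that are already embedded, shape (batch, max_len, emb_size).
# Each of the 300 floats per word is treated as an id and embedded again.
x_matrix = np.zeros((1, max_len, emb_size))
out_bad = embedding_lookup(x_matrix)

print(out_ok.shape)   # (1, 38, 300)
print(out_bad.shape)  # (1, 38, 300, 300)
```

The second shape is the one reported in the traceback: the lookup adds a trailing axis to any input, so it must be given ids, not vectors.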
Here I generate a dummy example to show how to extract any intermediate output from the model correctly.
Create the dummy data:

import numpy as np

vocab_size = 111
emb_size = 300
input_length = 38
n_sample = 50
n_classes = 3

# random stand-ins for the pretrained embeddings, the integer-encoded sentences and the labels
embedding_matrix = np.random.uniform(-1, 1, (vocab_size, emb_size))
X = np.random.randint(0, vocab_size, (n_sample, input_length))
Y = np.random.randint(0, n_classes, (n_sample,))
Create and train the model:

import tensorflow as tf
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras import backend as K

model = Sequential()
model.add(Embedding(vocab_size, emb_size, input_length=input_length,
                    weights=[embedding_matrix], trainable=True))
model.add(Bidirectional(LSTM(100, dropout=0.2)))
model.add(Dense(n_classes, activation='sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='Adagrad', metrics=['accuracy'])
model.fit(X, Y, epochs=3)  ### TRAINED WITH X
Get the layer output:

layer_id = 2
get_layer_output = K.function([model.layers[0].input], [model.layers[layer_id].output])
layer_output = get_layer_output(X)[0]  ### EXTRACT FROM X

# equal to:
# sub_model = Model(model.input, model.layers[layer_id].output)
# layer_output = sub_model.predict(X)  ### EXTRACT FROM X
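Applied to the sentence from the question, this means feeding integer word ids instead of word2vec vectors. The following is a rough sketch only: `word_index` is a hypothetical vocabulary invented for illustration, and `encode_sentence` is a hypothetical helper. In practice the mapping has to be the one used to build train_sequences, and the padding side and padding id have to match what was done at training time (padding at the end with 0 is an assumption here).

```python
def encode_sentence(sentence, word_index, max_len=38, pad_id=0, oov_id=0):
    """Turn a sentence into a fixed-length list of integer word ids."""
    tokens = sentence.lower().split()[:max_len]           # truncate to max_len words
    ids = [word_index.get(tok, oov_id) for tok in tokens]  # unknown words -> oov_id
    ids += [pad_id] * (max_len - len(ids))                 # pad at the end (assumption)
    return ids


# Hypothetical vocabulary; the real one comes from the training pipeline.
word_index = {"the": 1, "weather": 2, "today": 3, "was": 4,
              "extremely": 5, "unpredictable": 6}

sentence_label = "the weather today was extremely unpredictable,0"
sentence, label = sentence_label.rsplit(",", 1)  # split off the trailing label only

ids = encode_sentence(sentence, word_index)
batch = [ids]  # one sample of 38 ids, i.e. shape (1, 38) once converted to an array

print(len(batch), len(batch[0]), int(label))
```

A batch built this way has the 2D integer form the Embedding layer was trained on, so it can be passed to get_layer_output or to a sub-model in place of the 3D x_matrix.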
