Expected ndim=3, found ndim=4 when using K.function() in the Keras backend to get an intermediate layer of a model

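I am trying to extract the last layer of a classification model trained on some data. The first layer is an Embedding layer, followed by a BiLSTM, and then an output Dense layer. Below is my code. I keep getting a 4D output (1, 38, 300, 300) instead of a 3D one (1, 38, 300). Here 1 is the sample size, 38 is the maximum sentence length, and 300 is the word2vec vector length.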

from keras import backend as K
from tensorflow.keras.models import load_model
import numpy as np
import gensim
word2vec = 'GoogleNews-vectors-negative300.txt'


x_matrix = np.zeros((1, 38, 300))
sentene_label = 'the weather today was extremely unpredictable,0'
parts = sentene_label.split(',')
label = int(parts[1])  
sentence = parts[0] 

words = sentence.split(' ')
words = words[:x_matrix.shape[1]]  
for j, word in enumerate(words):
    if word in word2vec:
        # x_matrix[0, j, :] = word2vec[word]
        x_matrix[0, j, :] = loaded_model.word_vec(word)


model = load_model('TrainedModel.h5')
get_3rd_layer_output = K.function([model.layers[0].input], [model.layers[2].output])  
layer_output = get_3rd_layer_output(x_matrix)[0]
print("Layer Output Shape 1 : ", layer_output.shape)
I have rechecked my code several times, but I cannot see why the dimensions are wrong.

Here is the traceback:

Traceback (most recent call last):
  File "/usr/pkg/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3427, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-bb840b495480>", line 1, in <module>
    runfile('/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation/test.py', wdir='/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation')
  File "/am/embassy/vol/x6/jetbrains/apps/PyCharm-P/ch-0/201.7846.77/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/am/embassy/vol/x6/jetbrains/apps/PyCharm-P/ch-0/201.7846.77/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/am/vuwstocoisnrin1.vuw.ac.nz/ecrg-solar/kosimadukwe/Data Augmentation/test.py", line 451, in <module>
    layer_output = get_3rd_layer_output(x_matrix)[0]
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/backend.py", line 4073, in func
    outs = model(model_inputs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1012, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py", line 424, in call
    return self._run_internal_graph(
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/functional.py", line 560, in _run_internal_graph
    outputs = node.layer(*args, **kwargs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/layers/wrappers.py", line 539, in __call__
    return super(Bidirectional, self).__call__(inputs, **kwargs)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 998, in __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
  File "/usr/pkg/lib/python3.8/site-packages/tensorflow/python/keras/engine/input_spec.py", line 219, in assert_input_compatibility
    raise ValueError('Input ' + str(input_index) + ' of layer ' +
ValueError: Input 0 of layer bidirectional_9 is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: (1, 38, 300, 300)
The shape of x_matrix before calling get_3rd_layer_output is:

The shape of X matrix : (60, 38, 300)

Architecture of the trained model:

model = Sequential()
model.add(Embedding(vocab_size, 300, input_length=38, weights=[embedding_matrix], trainable=True))
model.add(Bidirectional(LSTM(100, dropout=0.2)))
model.add(Dense(3, activation='sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='Adagrad', metrics=['accuracy'])
model.summary()

es = EarlyStopping(monitor='val_loss', mode='min', baseline=0.3, patience=100, verbose=1)
mc = ModelCheckpoint('TrainedModel.h5', monitor='val_loss', mode='min', verbose=1, save_best_only=True)
hist = model.fit(train_sequences, train_y, epochs=200, verbose=False, batch_size=100,validation_data=(val_sequences, val_y),callbacks=[es, mc]) 

The summary of TrainedModel is:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_9 (Embedding)      (None, 38, 300)           7370400   
_________________________________________________________________
bidirectional_9 (Bidirection (None, 200)               320800    
_________________________________________________________________
dense_9 (Dense)              (None, 3)                 603       
=================================================================
Total params: 7,691,803
Trainable params: 7,691,803
Non-trainable params: 0
_________________________________________________________________


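The correct way to get the output of any intermediate layer is to create a sub-model that takes the same input as the model you trained. In your case the error occurs because you pass a 3D matrix of word vectors to the trained model, whereas you must pass the same kind of data that was used for training (2D data of integer-encoded words). The Embedding layer replaces every element of its input with a 300-dimensional vector, so a (1, 38, 300) input becomes (1, 38, 300, 300), which is exactly the shape the Bidirectional layer rejects in the traceback.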
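The shape arithmetic above can be reproduced without Keras. The following is only a sketch, assuming that an embedding lookup behaves like NumPy integer-array indexing into the embedding matrix; the toy sizes and the helper name `embedding_lookup` are made up for illustration and are not part of the original model.

```python
import numpy as np

# Toy sizes chosen for illustration (the real model uses seq_len=38, emb_size=300).
vocab_size, emb_size, seq_len = 10, 4, 5
embedding_matrix = np.random.uniform(-1, 1, (vocab_size, emb_size))


def embedding_lookup(ids):
    """Hypothetical stand-in for an Embedding layer: each id becomes its vector."""
    return embedding_matrix[np.asarray(ids, dtype=int)]


# Correct input: 2D integer ids with shape (batch, seq_len).
ids_2d = np.random.randint(0, vocab_size, (1, seq_len))
out_ok = embedding_lookup(ids_2d)
print(out_ok.shape)   # (1, 5, 4): one vector per token

# Wrong input: already-embedded vectors with shape (batch, seq_len, emb_size).
vectors_3d = np.zeros((1, seq_len, emb_size))
out_bad = embedding_lookup(vectors_3d)
print(out_bad.shape)  # (1, 5, 4, 4): every float is treated as an id and embedded again
```

With the real sizes the second case gives (1, 38, 300, 300), matching the "Full shape received" in the error message.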

Here I build a dummy example that shows how to extract any intermediate output from the model correctly.

Create the dummy data:

import numpy as np

vocab_size = 111
emb_size = 300
input_length = 38
n_sample = 50
n_classes = 3

embedding_matrix = np.random.uniform(-1,1, (vocab_size, emb_size))
X = np.random.randint(0,vocab_size, (n_sample, input_length))
Y = np.random.randint(0,n_classes, (n_sample,))
Create and train the model:

import tensorflow as tf
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras import backend as K

model = Sequential()
model.add(Embedding(vocab_size, emb_size, input_length=input_length, 
                    weights=[embedding_matrix], trainable=True))
model.add(Bidirectional(LSTM(100, dropout=0.2)))
model.add(Dense(n_classes, activation='sigmoid'))
model.compile(loss='sparse_categorical_crossentropy', 
              optimizer='Adagrad', metrics=['accuracy'])
model.fit(X,Y, epochs=3)  ### TRAINED WITH X
Get the layer output:

layer_id = 2
get_layer_output = K.function([model.layers[0].input], [model.layers[layer_id].output])
layer_output = get_layer_output(X)[0]  ### EXTRACT FROM X
# equal to:
# sub_model = Model(model.input, model.layers[layer_id].output)
# layer_output = sub_model.predict(X)  ### EXTRACT FROM X
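For the sentence in the question, this means turning the words into integer ids before calling the function, rather than into word2vec vectors. Below is a minimal sketch of that step. The `word_index` dictionary, the `encode_sentence` helper, the lower-casing, and the choice of 0 for padding and unknown words are all assumptions made for illustration; in practice the mapping and padding side must be the ones used to build `train_sequences`.

```python
def encode_sentence(sentence, word_index, max_len=38, pad_id=0, oov_id=0):
    """Map words to integer ids and right-pad to max_len (hypothetical helper)."""
    tokens = sentence.lower().split()[:max_len]
    ids = [word_index.get(tok, oov_id) for tok in tokens]
    return ids + [pad_id] * (max_len - len(ids))


# Hypothetical vocabulary; the real one must come from the training pipeline.
word_index = {"the": 1, "weather": 2, "today": 3, "was": 4,
              "extremely": 5, "unpredictable": 6}

sentence_label = "the weather today was extremely unpredictable,0"
sentence, label = sentence_label.rsplit(",", 1)

# A batch containing one sentence: 1 row of 38 integer ids.
encoded = [encode_sentence(sentence, word_index)]
print(len(encoded), len(encoded[0]))
print(encoded[0][:8])
```

Wrapping `encoded` in `np.array(...)` gives a (1, 38) integer array, which is the kind of input the `K.function` call above expects in place of the (1, 38, 300) matrix of word vectors.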