Python: Predicting a single review from input text with a saved CNN model


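I am building a classifier based on a CNN model in Keras.

I will use it in an application where the user opens the app and enters some text; the model is then loaded from its saved weights and makes a prediction.

The complication is that I also use GloVe embeddings, and the CNN model expects padded text sequences.

I use the Keras Tokenizer as follows: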

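# Assumed import; the original snippet omits it
from keras.preprocessing import text, sequence

# Build the vocabulary from the training texts
tokenizer = text.Tokenizer(num_words=max_features, lower=True, char_level=False)
tokenizer.fit_on_texts(list(train_x))

# Convert each text into a sequence of word indices
train_x = tokenizer.texts_to_sequences(train_x)
test_x = tokenizer.texts_to_sequences(test_x)

# Pad or truncate every sequence to the same length
train_x = sequence.pad_sequences(train_x, maxlen=maxlen)
test_x = sequence.pad_sequences(test_x, maxlen=maxlen)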

I trained the model and ran predictions on the test data. Now I want to test it with the saved model, which I have loaded successfully.

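My problem is that when I supply a single review, it has to go through tokenizer.texts_to_sequences(), and I end up with a 2D array of shape (num_chars, maxlen), and therefore num_chars predictions, whereas I need an input of shape (1, maxlen).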

I use the following code to predict:

review = 'well free phone cingular broke stuck not abl offer kind deal number year contract up realli want razr so went look cheapest one could find so went came euro charger small adpat made fit american outlet, gillett fusion power replac cartridg number count packagemay not greatest valu out have agillett fusion power razor'
xtest = tokenizer.texts_to_sequences(review)
xtest = sequence.pad_sequences(xtest, maxlen=maxlen)

model.predict(xtest)

The output is:

array([[0.29289   , 0.36136267, 0.6205081 ],
       [0.362869  , 0.31441122, 0.539749  ],
       [0.32059124, 0.3231736 , 0.5552745 ],
       ...,
       [0.34428033, 0.3363668 , 0.57663095],
       [0.43134686, 0.33979046, 0.48991954],
       [0.22115968, 0.27314988, 0.6188136 ]], dtype=float32)

What I need here is a single prediction, such as array([0.29289, 0.36136267, 0.6205081]), because I only have a single review.

The problem is that you need to pass a list of strings to texts_to_sequences. So you need to put the single review in a list, like this:
xtest = tokenizer.texts_to_sequences([review])

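If you don't do this (i.e. you pass a string rather than a list of strings), then, because strings in Python are iterable, the method iterates over the characters of the given string and treats each character, rather than each word, as a separate text to tokenize: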
oov_token_index = self.word_index.get(self.oov_token)
for text in texts:  # <-- it would iterate over the string instead
    if self.char_level or isinstance(text, list):
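The effect of that loop can be reproduced without Keras. The sketch below is only an illustration: `toy_texts_to_sequences` is a hypothetical, heavily simplified stand-in for `Tokenizer.texts_to_sequences` (it splits on whitespace and ignores filters, `num_words` and OOV handling), and `word_index` is a made-up vocabulary.

```python
def toy_texts_to_sequences(texts, word_index):
    """Simplified stand-in for Tokenizer.texts_to_sequences (illustration only)."""
    sequences = []
    for text in texts:  # a bare string is iterated character by character
        words = text.lower().split()
        # Keep only the words that exist in the vocabulary
        sequences.append([word_index[w] for w in words if w in word_index])
    return sequences


word_index = {"a": 1, "good": 2, "phone": 3}  # hypothetical fitted vocabulary
review = "a good phone"

as_string = toy_texts_to_sequences(review, word_index)    # string passed directly
as_list = toy_texts_to_sequences([review], word_index)    # string wrapped in a list

print(len(as_string))  # one "sequence" per character of the review
print(as_list)         # a single sequence for the whole review
```

Each of those per-character sequences becomes its own row after padding, which is why the model returns one prediction per character instead of one per review.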

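Since the text in the application comes from user input, one option is to normalize it before tokenizing so that a bare string can never reach texts_to_sequences. This is a minimal sketch, not part of Keras; `as_text_list` is a hypothetical helper name.

```python
def as_text_list(texts):
    """Return a list of strings, wrapping a single string in a list."""
    if isinstance(texts, str):
        return [texts]
    return list(texts)


# Intended use with the objects from the question (not executed here):
# xtest = tokenizer.texts_to_sequences(as_text_list(review))
# xtest = sequence.pad_sequences(xtest, maxlen=maxlen)   # shape (1, maxlen)
# model.predict(xtest)                                   # one row of scores
```

With this guard, the same prediction path works for one review or for a batch of reviews.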