Why does my Keras LSTM model get stuck in an infinite loop?

python tensorflow keras neural-network lstm

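I'm trying to build a small LSTM that can learn to write code (even if it's garbage code) by training on existing Python code. I've concatenated several thousand lines of code from a few hundred files into a single file, with each file ending in <eos> to mark "end of sequence".

For example, my training file looks like this: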


setup(name='Keras',
...
      ],
      packages=find_packages())
<eos>
import pyux
...
with open('api.json', 'w') as f:
    json.dump(sign, f)
<eos>

My Keras model is:

model = Sequential()
model.add(Embedding(input_dim=len(self._words), output_dim=1024))

model.add(Bidirectional(
    LSTM(128), input_shape=(self.seq_length, len(self._words))))

model.add(Dropout(rate=0.5))
model.add(Dense(len(self._words)))
model.add(Activation('softmax'))

model.compile(loss='sparse_categorical_crossentropy',
              optimizer="adam", metrics=['accuracy'])

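But no matter how much I train it, the model never seems to generate <eos>, or even \n. I thought this might be because my LSTM size is 128 while my seq_length is 200, but that doesn't make much sense. Is there something I'm missing?

Sometimes, when there is no limit on code generation, or when the <EOS> or <SOS> tokens are not numerical tokens, the LSTM never converges. If you could post your outputs or error messages, it would be much easier to debug.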

You can create an extra class to collect the words and sentences:

# tokens for start of sentence(SOS) and end of sentence(EOS)

SOS_token = 0
EOS_token = 1


class Lang:
    '''
    class for word object, storing sentences, words and word counts.
    '''
    def __init__(self, name):
        self.name = name
        self.word2index = {}
        self.word2count = {}
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2  # Count SOS and EOS

    def addSentence(self, sentence):
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        if word not in self.word2index:
            self.word2index[word] = self.n_words
            self.word2count[word] = 1
            self.index2word[self.n_words] = word
            self.n_words += 1
        else:
            self.word2count[word] += 1

Then, when generating text, just add the <EOS> token.
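As a rough sketch of that idea, the loop below stops either when the end token is produced or when a hard cap is reached, so generation cannot run forever. `next_token_fn` is a hypothetical stand-in for whatever sampling step the trained model provides, and `MAX_TOKENS` is an assumed limit, neither comes from the question.

```python
SOS_token = 0
EOS_token = 1
MAX_TOKENS = 50  # assumed hard cap so generation always terminates


def generate(next_token_fn, max_tokens=MAX_TOKENS):
    """Collect tokens until EOS is predicted or max_tokens is reached."""
    tokens = [SOS_token]
    for _ in range(max_tokens):
        token = next_token_fn(tokens)  # hypothetical model sampling step
        if token == EOS_token:
            break
        tokens.append(token)
    return tokens[1:]  # drop the leading SOS


def never_stops(tokens):
    """Stand-in for a model that never predicts EOS."""
    return 2


def stops_after_three(tokens):
    """Stand-in for a model that predicts EOS once three tokens follow SOS."""
    return EOS_token if len(tokens) >= 4 else 2 + len(tokens)


short_sample = generate(stops_after_three)
capped_sample = generate(never_stops, max_tokens=5)
```

With the cap in place, a model that never learns to emit the end token still returns after `max_tokens` steps instead of looping indefinitely.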

You can use a character-level RNN as a reference.

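Are you converting the words to numeric tokens? How are you actually feeding the data into your model for fitting? All I can see so far is the definition of the current and next sequences, but those are the actual tokens themselves.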
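To make the question about numeric tokens concrete, here is a minimal sketch of one way to turn the training text into integer ids and (window, next token) pairs. It assumes a simple regex tokenizer that keeps `\n` and `<eos>` as tokens of their own; the helper names `tokenize`, `build_vocab` and `make_windows` are hypothetical, and this is not necessarily how the asker's pipeline works. If the tokenizer discards newlines or the end marker, the model never sees them as targets and cannot learn to produce them.

```python
import re


def tokenize(text):
    """Split text into tokens, keeping newlines and <eos> as separate tokens."""
    # Alternatives are tried left to right: <eos>, newline, word, other symbol.
    return re.findall(r"<eos>|\n|\w+|[^\w\s]", text)


def build_vocab(tokens):
    """Map each distinct token to an integer id, in order of first appearance."""
    word2index = {}
    for tok in tokens:
        if tok not in word2index:
            word2index[tok] = len(word2index)
    return word2index


def make_windows(ids, seq_length):
    """Pair every run of seq_length ids with the id that follows it."""
    inputs, targets = [], []
    for i in range(len(ids) - seq_length):
        inputs.append(ids[i:i + seq_length])
        targets.append(ids[i + seq_length])
    return inputs, targets


text = "x = 1\n<eos>\ny = 2\n<eos>\n"
tokens = tokenize(text)
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
inputs, targets = make_windows(ids, seq_length=3)
```

The integer `targets` are in the form that `sparse_categorical_crossentropy` expects, and checking that the ids for `\n` and `<eos>` actually occur in `targets` is a quick sanity test before training.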
