Machine learning: RNN doesn't train (PyTorch)

I can't understand what I'm doing wrong during training. I'm trying to train an RNN on a sequence AND operation (to see how it works on a simple task). But my network is not learning: the loss stays the same and it can't even overfit the model. Can you help me find the problem?

The data I am using:

data = [
    [1, 1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1],
    [0],
    [1],
    [1, 0]]
labels = [
    0,
    1, 
    0, 
    0,
    1,
    1,
    0,
    1,
    0
]
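
For clarity, each label appears to be the logical AND of all the elements in its sequence, which is the operation the network is supposed to learn. A quick check (a sketch of my own, not part of the original post):

# Hypothetical sanity check: every label should equal the AND of its sequence.
for seq, lab in zip(data, labels):
    assert int(all(seq)) == lab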
The code for the network:

import torch
import torch.nn as nn
from torch.autograd import Variable

class AndRNN(nn.Module):
    def __init__(self):
        super(AndRNN, self).__init__()
        # input_size=1, hidden_size=10, num_layers=5
        self.rnn = nn.RNN(1, 10, 5)
        self.fc = nn.Sequential(
            nn.Linear(10, 30),
            nn.Linear(30, 2)
        )

    def forward(self, input, hidden):
        x, hidden = self.rnn(input, hidden)
        # classify from the output of the last time step
        x = self.fc(x[-1])
        return x, hidden

    def initHidden(self):
        # shape (num_layers, batch, hidden_size)
        return Variable(torch.zeros((5, 1, 10)))
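
Here nn.RNN(1, 10, 5) means input_size=1, hidden_size=10 and num_layers=5, so the initial hidden state must have shape (num_layers, batch, hidden_size) = (5, 1, 10), which is what initHidden returns. A quick shape check (a sketch of my own, assuming the imports and class above):

net = AndRNN()
dummy = Variable(torch.zeros(4, 1, 1))   # (seq_len=4, batch=1, input_size=1)
out, hidden = net(dummy, net.initHidden())
print(out.size())                        # torch.Size([1, 2]) - logits for the last step
print(hidden.size())                     # torch.Size([5, 1, 10])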
The training loop:

net = AndRNN()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

correct = 0
for e in range(20):
    for i in range(len(data)):
        # (seq_len, batch=1, input_size=1)
        tensor = torch.FloatTensor(data[i]).view(-1, 1, 1)
        label = torch.LongTensor([labels[i]])
        hidden = net.initHidden()
        optimizer.zero_grad()

        out, hidden = net(Variable(tensor), Variable(hidden.data))

        # track accuracy with the top-1 prediction
        _, l = torch.topk(out, 1)
        if label[0] == l[0].data[0]:
            correct += 1

        loss = criterion(out, Variable(label))
        loss.backward()
        optimizer.step()

        print("Loss ", loss.data[0], "Accuracy ", (correct / (i + 1)))

The shape of the tensor will be (sequence length, 1 (i.e. batch size), 1), which is correct according to the PyTorch documentation for RNN.

The problem is in this line:

out, hidden = net(Variable(tensor), Variable(hidden.data))
It should simply be

out, hidden = net(Variable(tensor), hidden)
By passing Variable(hidden.data) here, you are creating a new hidden-state variable (all zeros) at every step, instead of passing on the hidden state from the previous step.

I tried your example and changed the optimizer to Adam. Here is the complete code:

class AndRNN(nn.Module):
    def __init__(self):
        super(AndRNN, self).__init__()
        self.rnn = nn.RNN(1, 10, 5)
        self.fc = nn.Sequential(
            nn.Linear(10, 30),
            nn.Linear(30, 2)
        )

    def forward(self, input, hidden):
        x, hidden = self.rnn(input, hidden)
        x = self.fc(x[-1])        
        return x, hidden

    def initHidden(self):
        return Variable(torch.zeros((5, 1, 10)))

net = AndRNN()    
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters())

correct = 0
for e in range(100):
    for i in range(len(data)):
        tensor = torch.FloatTensor(data[i]).view(-1, 1, 1)
        label = torch.LongTensor([labels[i]])
        hidden = net.initHidden()
        optimizer.zero_grad()

        out, hidden = net(Variable(tensor), hidden)  # pass the hidden state Variable directly (the fix)

        loss = criterion(out, Variable(label))
        loss.backward()
        optimizer.step()
    if e % 25 == 0:
        print("Loss ", loss.data[0])
The result:

Loss  0.6370733976364136
Loss  0.25336754322052
Loss  0.006924811284989119
Loss  0.002351854695007205
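
Once it has trained, a prediction for a new sequence can be read off with an argmax over the two output logits. A minimal inference sketch of my own, assuming the trained net above and that class 1 corresponds to an all-ones sequence:

test = [1, 1, 1]                                          # all ones -> expect class 1
tensor = Variable(torch.FloatTensor(test).view(-1, 1, 1))
out, _ = net(tensor, net.initHidden())
_, predicted = torch.max(out, 1)                          # index of the larger logit
print(predicted.data[0])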

Thanks, that looks reasonable!