Python: RNN parameters not updating?

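I'm very new to PyTorch and fairly new to neural networks in general.
I'm trying to build a neural network that guesses the gender of a name, based on the PyTorch RNN tutorial that classifies names by nationality.
The code runs without errors, but the loss barely changes, which makes me think the weights are not being updated.
Is this a problem with how I set up my input/output/target tensors, or is something wrong with my training function? I'm quite lost, and any help would be much appreciated.
Here is my code: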

from __future__ import unicode_literals, print_function, division
from io import open
import glob
import unicodedata
import string
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
import random
from torch.autograd import Variable

"""------GLOBAL VARIABLES------"""

all_letters = string.ascii_letters + " .,;'"
num_letters = len(all_letters)
all_names = {}
genders = ["Female", "Male"]

"""-------DATA EXTRACTION------"""

def findFiles(path):
    return glob.glob(path)

def unicodeToAscii(s):
    return ''.join(
        c for c in unicodedata.normalize('NFD', s)
        if unicodedata.category(c) != 'Mn'
        and c in all_letters
    )

# Read a file and split into lines
def readLines(filename):
    lines = open(filename, encoding='utf-8').read().strip().split('\n')
    return [unicodeToAscii(line) for line in lines]

for file in findFiles("/home/andrew/PyCharm/PycharmProjects/CantStop/data/names/*.txt"):
    gender = file.split("/")[-1].split(".")[0]
    names = readLines(file)
    all_names[gender] = names

"""-----DATA INTERPRETATION-----"""

def nameToTensor(name):
    tensor = torch.zeros(len(name), 1, num_letters)
    for index, letter in enumerate(name):
        tensor[index][0][all_letters.find(letter)] = 1
    return tensor

def outputToGender(output):
    gender, gender_index = output.data.topk(1)
    if gender_index[0][0] == 0:
        return "Female"
    return "Male"

"""------NETWORK SETUP------"""

class Net(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Net, self).__init__()
        self.hidden_size = hidden_size
        #Layer 1
        self.Lin1 = nn.Linear(input_size+hidden_size, int((input_size+hidden_size)/2))
        self.ReLu1 = nn.ReLU()
        self.Batch1 = nn.BatchNorm1d(int((input_size+hidden_size)/2))
        #Layer 2
        self.Lin2 = nn.Linear(int((input_size+hidden_size)/2), output_size)
        self.ReLu2 = nn.ReLU()
        self.Batch2 = nn.BatchNorm1d(output_size)
        self.softMax = nn.LogSoftmax()
        #Hidden layer
        self.HidLin = nn.Linear(input_size+hidden_size, hidden_size)
        self.HidReLu = nn.ReLU()
        self.HidBatch = nn.BatchNorm1d(hidden_size)

    def forward(self, input, hidden):
        comb = torch.cat((input, hidden), 1)
        hidden = self.HidBatch(self.HidReLu(self.HidLin(comb)))
        output1 = self.Batch1(self.ReLu1(self.Lin1(comb)))
        output2 = self.softMax(self.Batch2(self.ReLu2(self.Lin2(output1))))
        return output2, hidden

    def initHidden(self):
        return Variable(torch.zeros(1, self.hidden_size))

NN = Net(num_letters, 128, 2)

"""------TRAINING------"""

def getRandomTrainingEx():
    gender = genders[random.randint(0, 1)]
    name = all_names[gender][random.randint(0, len(all_names[gender])-1)]
    gender_tensor = Variable(torch.LongTensor([genders.index(gender)]))
    name_tensor = Variable(nameToTensor(name))
    return gender_tensor, name_tensor, gender

def train(input, target):
    hidden = NN.initHidden()

    loss_func = nn.NLLLoss()

    alpha = 0.01

    NN.zero_grad()

    for i in range(input.size()[0]):
        output, hidden = NN(input[i], hidden)

    loss = loss_func(output, target)
    loss.backward()
    for w in NN.parameters():
        w.data.add_(-alpha, w.grad.data)

    return output, loss

for i in range(5000):
    gender_tensor, name_tensor, gender = getRandomTrainingEx()
    output, loss = train(name_tensor, gender_tensor)

    if i%500 == 0:
        print("Guess: %s, Correct: %s, Loss: %s" % (outputToGender(output), gender, loss.data[0]))

Here is the output:

Guess: Male, Correct: Male, Loss: 0.6931471824645996
Guess: Male, Correct: Female, Loss: 0.7400936484336853
Guess: Male, Correct: Male, Loss: 0.6755779385566711
Guess: Female, Correct: Female, Loss: 0.6648257374763489
Guess: Male, Correct: Male, Loss: 0.6765623688697815
Guess: Female, Correct: Male, Loss: 0.7330614924430847
Guess: Female, Correct: Female, Loss: 0.6565149426460266
Guess: Male, Correct: Female, Loss: 0.6946508884429932
Guess: Female, Correct: Female, Loss: 0.6621525287628174
Guess: Male, Correct: Male, Loss: 0.6662092804908752

Process finished with exit code 0

I suggest changing add_ to sub_. Adding the gradient can move you away from the optimum:

w.data.sub_(w.grad.data * alpha)

because the weight-update formula contains a subtraction.
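
For reference, the plain gradient-descent update that this answer refers to can be written as below, where $w$ is a parameter, $\alpha$ the learning rate, and $L$ the loss (the symbols are chosen here for illustration and do not come from the original post). One caveat worth checking: in the older two-argument form `add_(value, other)`, the gradient is scaled by `value` before being added, so `w.data.add_(-alpha, w.grad.data)` in the question should already perform this same subtraction.

```latex
w \leftarrow w - \alpha \, \frac{\partial L}{\partial w}
```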


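By the way, try different values of alpha, such as 0.1, 0.05, and 0.01. If alpha is too large, the update may overshoot the optimum. If alpha is too small, training takes a long time.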
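
The effect of the learning rate can be illustrated with a minimal sketch in plain Python. It does not use the network from the question; instead it assumes a toy loss f(w) = w**2 (minimum at w = 0), and the function name `gradient_descent` is hypothetical.

```python
def gradient_descent(alpha, steps=20, w0=1.0):
    """Minimise the toy loss f(w) = w**2 with plain gradient descent.

    Returns the value of w after `steps` updates.
    """
    w = w0
    for _ in range(steps):
        grad = 2.0 * w        # derivative of w**2 with respect to w
        w = w - alpha * grad  # step against the gradient (note the minus)
    return w


if __name__ == "__main__":
    # Distance from the optimum (w = 0) after 20 steps for three step sizes.
    for alpha in (0.001, 0.1, 1.1):
        print(alpha, abs(gradient_descent(alpha)))
```

On this toy problem a very small alpha barely moves w, a moderate alpha gets close to the minimum, and an alpha above 1.0 makes each step overshoot so that w grows instead of shrinking. The thresholds are specific to this toy loss and will differ for a real network.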

What loss and accuracy do you get during training? Also, hi and welcome to Stack Overflow. Please take a moment to go through the tour to learn your way around here (and to earn your first badge), and read the guidance on creating an example and on asking questions, to increase your chances of getting feedback and useful answers.

Good suggestion about changing the learning rate. What worries me is that the loss does not change consistently. It jumps between 0.6, 0.5, and 0.7, and so on.

Maybe you should include your dataset in your code, or post your dataset here (for example as a text file), so we can debug your code.
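
On the observation that the loss bounces around 0.6 to 0.7: each printed value in the question is the loss of a single random example, so some jitter is expected. A rough sanity check, sketched below, is to compare the average of the posted values with the loss of a two-class model that always outputs probability 0.5, which is ln 2. The helper name `mean_loss` is hypothetical, and the comparison assumes the negative log-likelihood loss used in the question.

```python
import math

# Per-example losses copied from the output posted in the question.
posted_losses = [
    0.6931471824645996, 0.7400936484336853, 0.6755779385566711,
    0.6648257374763489, 0.6765623688697815, 0.7330614924430847,
    0.6565149426460266, 0.6946508884429932, 0.6621525287628174,
    0.6662092804908752,
]


def mean_loss(losses):
    """Average a list of per-example losses (hypothetical helper)."""
    return sum(losses) / len(losses)


# NLL of a two-class model that always predicts probability 0.5, i.e. ln 2.
chance_loss = -math.log(0.5)

if __name__ == "__main__":
    print("chance-level loss:", round(chance_loss, 4))
    print("mean posted loss: ", round(mean_loss(posted_losses), 4))
```

If the two numbers are close, the model is likely still guessing at chance level. In the training loop, accumulating the loss over the 500 iterations between prints and reporting the mean would probably give a steadier signal than a single example.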