Python: LSTM hidden states in PyTorch are almost identical, resulting in a negative loss

I have been puzzling over this problem and I can't figure out what I'm doing wrong. I have trained an autoencoder (LSTM-LSTM), and now I am trying to use the encoded features for another task with KLDivLoss. However, the encoded features are almost always identical (see the example below, with print precision set to 10):

Clustering code:

import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class Clusterizer(nn.Module):
    def __init__(self, n_clusters, hidden_dim, encoder, alpha):
        super(Clusterizer, self).__init__()
        self.n_clusters = n_clusters
        self.hidden_dim = hidden_dim
        self.encoder = encoder
        self.alpha = alpha
        self.centroids = None

    def init_centroids(self, encoded_x):
        '''Initialize the cluster centers with KMeans'''
        kmeans = KMeans(n_clusters=self.n_clusters, random_state=0, n_init=10).fit(encoded_x)
        centroids = torch.tensor(kmeans.cluster_centers_, dtype=torch.float)
        self.centroids = nn.Parameter(centroids, requires_grad=True)

    def target_distribution(self, q_):
        weight = (q_ ** 2) / torch.sum(q_, 0)
        return (weight.t() / torch.sum(weight, 1)).t()

    def forward(self, encoded_x):
        num = ((1 + torch.norm(encoded_x.unsqueeze(1) - self.centroids, dim=2)) / self.alpha) ** ((self.alpha + 1) / 2)
        den = torch.sum(num, dim=1, keepdim=True)
        return num / den
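For reference, the forward pass appears to follow the Student's-t soft assignment used in Deep Embedded Clustering (DEC). Below is a minimal comparison sketch of that standard formulation (an assumption about the intended formula, not the exact code above: it uses the squared distance and a negative exponent):

def dec_soft_assignment(encoded_x, centroids, alpha=1.0):
    # squared Euclidean distance of each sample to each centroid: shape (N, n_clusters)
    dist_sq = torch.sum((encoded_x.unsqueeze(1) - centroids) ** 2, dim=2)
    # q_ij = (1 + ||z_i - mu_j||^2 / alpha)^(-(alpha + 1) / 2), then row-normalize
    num = (1.0 + dist_sq / alpha) ** (-(alpha + 1.0) / 2.0)
    return num / torch.sum(num, dim=1, keepdim=True)  # each row sums to 1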
The last part of the code (where the problem occurs); here encoded_data is the stacked output of the encoder, of shape (N, hidden_size):

criterion = nn.KLDivLoss(size_average=False)
...
clusterizer.init_centroids(encoded_data)
output = clusterizer(encoded_data.to(device))
target_distrib = clusterizer.target_distribution(output)
loss = criterion(output.log(), target_distrib)
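One thing worth noting about KLDivLoss: it expects the input to be log-probabilities and the target to be probabilities, so if either tensor's rows do not sum to 1 the summed value is not a true KL divergence and can come out negative. (In recent PyTorch versions, size_average=False is deprecated in favor of reduction='sum'.) A quick sanity check, sketched with the variable names used above:

# both `output` (before .log()) and `target_distrib` should sum to ~1 along dim=1
row_sums = output.sum(dim=1)
print(row_sums.min().item(), row_sums.max().item())        # should both be close to 1.0
tgt_sums = target_distrib.sum(dim=1)
print(tgt_sums.min().item(), tgt_sums.max().item())        # should both be close to 1.0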
Sorry for the amount of code, but I hope it helps pin down the source of the problem.

# last hidden state of the encoder
tensor([[[ 0.1086065620, -0.0446619801, -0.0530930459,  ...,
          -0.0573375113,  0.1083261892,  0.0037083717],
         [ 0.1086065620, -0.0446619801, -0.0530930459,  ...,
          -0.0573375151,  0.1083261892,  0.0037083712],
         [ 0.1086065620, -0.0446619801, -0.0530930459,  ...,
          -0.0573375188,  0.1083262041,  0.0037083719],
         ...,
         [ 0.1086065620, -0.0446619801, -0.0530930422,  ...,
          -0.0573375151,  0.1083262041,  0.0037083724],
         [ 0.1086065620, -0.0446619801, -0.0530930385,  ...,
          -0.0573375151,  0.1083262041,  0.0037083712],
         [ 0.1086065620, -0.0446619801, -0.0530930385,  ...,
          -0.0573375188,  0.1083261892,  0.0037083707]]],
       grad_fn=<StackBackward>)
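Since the rows of this tensor are nearly identical, one way to quantify how much the encoder output actually varies across samples is a quick check like the following sketch (assuming encoded_data has shape (N, hidden_size) as described above). If the per-dimension standard deviation is essentially zero, the KMeans centroids will almost coincide and the soft assignments become nearly uniform:

# how much do the encoded features vary across samples?
per_dim_std = encoded_data.std(dim=0)          # one value per hidden dimension
print(per_dim_std.max().item(), per_dim_std.mean().item())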