Machine learning: What does the "source hidden state" refer to in the attention mechanism?

machine-learning nlp deep-learning

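The attention weights are computed as follows:

$$\alpha_{ts} = \frac{\exp\left(\operatorname{score}(h_t, \bar{h}_s)\right)}{\sum_{s'=1}^{S} \exp\left(\operatorname{score}(h_t, \bar{h}_{s'})\right)}$$

I would like to know what h_s refers to.

In the TensorFlow code, the encoder RNN returns a tuple: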

encoder_outputs, encoder_state = tf.nn.dynamic_rnn(...)

I thought h_s should be encoder_state, but the code below gives a different answer:

# attention_states: [batch_size, max_time, num_units]
attention_states = tf.transpose(encoder_outputs, [1, 0, 2])

# Create an attention mechanism
attention_mechanism = tf.contrib.seq2seq.LuongAttention(
    num_units, attention_states,
    memory_sequence_length=source_sequence_length)

Did I misunderstand the code? Or does h_s actually refer to encoder_outputs?

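The formula probably comes from this post, so I will use the NN picture from the same post:

Here, the h-bar(s) are all the blue hidden states from the encoder (last layer), and h(t) is the current red hidden state from the decoder (also the last layer). In the picture t=0, and you can see which blocks are wired to the attention weights by the dotted arrows. The score function is usually one of the following:

$$\operatorname{score}(h_t, \bar{h}_s) = \begin{cases} h_t^\top W \bar{h}_s & \text{(Luong's multiplicative style)} \\ v_a^\top \tanh\left(W_1 h_t + W_2 \bar{h}_s\right) & \text{(Bahdanau's additive style)} \end{cases}$$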
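As a rough illustration of how the weights are computed, here is a minimal pure-Python sketch. It assumes the simplest dot-product score (the multiplicative form with W set to the identity). The names `dot_score` and `attention_weights` and the toy vectors are made up for this example and are not part of TensorFlow.

```python
import math

def dot_score(h_t, h_s):
    """Dot-product score between decoder state h_t and one encoder state h_s."""
    return sum(a * b for a, b in zip(h_t, h_s))

def attention_weights(h_t, encoder_states):
    """Softmax over the scores of h_t against every source hidden state."""
    scores = [dot_score(h_t, h_s) for h_s in encoder_states]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy data: three source positions, hidden size 2 (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_t = [1.0, 1.0]
weights = attention_weights(h_t, encoder_states)
```

The point to notice is that the function needs one hidden state per source position, which is the shape of encoder_outputs, not of a single final state.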


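TensorFlow's attention mechanism matches this picture. In theory, a cell's output is in most cases its hidden state (the one exception is the LSTM cell, whose output is the short-term part of the state, and even then the output is the better fit for the attention mechanism). In practice, encoder_state holds only the state after the final time step, and it also differs from encoder_outputs when the input is padded with zeros: at padded steps the state is propagated from the previous cell state while the output is zero. Obviously you do not want to attend to trailing zeros, so it makes sense for h-bar(s) to be zero at those positions.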
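The padding behaviour can be mimicked without TensorFlow. The sketch below is a toy stand-in, not the real `tf.nn.dynamic_rnn`: `toy_dynamic_rnn` and its running-sum "cell" are hypothetical, chosen only to show that past `sequence_length` the emitted output is zero while the state is carried through unchanged.

```python
def toy_dynamic_rnn(inputs, sequence_length):
    """Toy single-sequence RNN whose cell is a running sum: state = state + x.

    An assumption-level imitation of the padding behaviour, not TensorFlow
    code: past sequence_length the output is zero and the previous state
    is carried forward unchanged.
    """
    state = 0.0
    outputs = []
    for t, x in enumerate(inputs):
        if t < sequence_length:
            state = state + x      # normal step: output equals the new state
            outputs.append(state)
        else:
            outputs.append(0.0)    # padded step: zero output, state kept
    return outputs, state

# Two real tokens followed by two padding positions.
outputs, final_state = toy_dynamic_rnn([1.0, 2.0, 0.0, 0.0], sequence_length=2)
```

In this toy run `outputs` has one entry per source position, with zeros at the padded steps, while `final_state` is a single value equal to the last non-padded output.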
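So encoder_outputs are exactly the arrows that go upward from the blue blocks. Later in the code, attention_mechanism is attached to each decoder_cell, so that its output goes through the context vector to the yellow block in the picture:
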
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    decoder_cell, attention_mechanism,
    attention_layer_size=num_units)
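To make the role of the context vector concrete, here is a hedged sketch of how it could be formed from the attention weights: a weighted sum of the encoder outputs. `context_vector` is a hypothetical helper and the numbers are invented. The real `AttentionWrapper` additionally projects the concatenation of the cell output and the context when `attention_layer_size` is set, which is omitted here.

```python
def context_vector(weights, encoder_states):
    """Weighted sum of encoder states: c_t = sum over s of alpha_ts * h_s."""
    size = len(encoder_states[0])
    return [
        sum(w * h_s[i] for w, h_s in zip(weights, encoder_states))
        for i in range(size)
    ]

# Hypothetical attention weights over three source positions (they sum to 1).
weights = [0.25, 0.25, 0.5]
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = context_vector(weights, encoder_states)
```

A position whose encoder output is all zeros, such as a padded one, contributes nothing to this sum regardless of its weight.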