Machine learning: LSTM followed by mean pooling

Tags: machine-learning, neural-network, deep-learning, keras, recurrent-neural-network

I'm using Keras 1.0. My question is the same as this one (), but the answer there doesn't seem to be enough for me.

I want to implement this network:

The following code does not work:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
pool = AveragePooling1D()(lstm)
output = Dense(1, activation='sigmoid')(pool)
If I don't set return_sequences=True, I get this error when calling AveragePooling1D() (without return_sequences=True the LSTM returns a 2-D tensor of shape (samples, features), while AveragePooling1D expects 3-D input of shape (samples, steps, features)):


Adding TimeDistributed(Dense(1)) helps:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, activation='sigmoid', inner_activation='hard_sigmoid', return_sequences=True)(embedded)
distributed = TimeDistributed(Dense(1))(lstm)
pool = AveragePooling1D()(distributed)
output = Dense(1, activation='sigmoid')(pool)
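One way to see the problem with this workaround (my probe, not from the question): wrap the graph up to the pooling layer in a throwaway Model and inspect its output shape. With the default pool length, AveragePooling1D averages over windows of two timesteps rather than over the whole sequence, and it ignores any mask, so the tensor is still 3-D.

from keras.models import Model

probe = Model(input=sequence, output=pool)  # Keras 1.0 functional API
print(probe.output_shape)  # (None, max_sent_len/2, 1): still one value per window, not per sample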

I think the accepted answer is basically wrong. A solution was found here: . However, it only works with the Theano backend. I modified the code so that it supports both Theano and TensorFlow:

from keras.engine.topology import Layer, InputSpec
from keras import backend as K

class TemporalMeanPooling(Layer):
    """
    This is a custom Keras layer. This pooling layer accepts the temporal
    sequence output by a recurrent layer and performs temporal pooling,
    looking at only the non-masked portion of the sequence. The pooling
    layer converts the entire variable-length hidden vector sequence
    into a single hidden vector, and then feeds its output to the Dense
    layer.

    input shape: (nb_samples, nb_timesteps, nb_features)
    output shape: (nb_samples, nb_features)
    """
    def __init__(self, **kwargs):
        super(TemporalMeanPooling, self).__init__(**kwargs)
        self.supports_masking = True
        self.input_spec = [InputSpec(ndim=3)]

    def get_output_shape_for(self, input_shape):
        return (input_shape[0], input_shape[2])

    def call(self, x, mask=None):  # mask: (nb_samples, nb_timesteps)
        if mask is None:
            # no mask supplied: treat every timestep as valid
            mask = K.mean(K.ones_like(x), axis=-1)
        ssum = K.sum(x, axis=-2)  # (nb_samples, nb_features)
        mask = K.cast(mask, K.floatx())
        rcnt = K.sum(mask, axis=-1, keepdims=True)  # (nb_samples, 1)
        return ssum / rcnt

    def compute_mask(self, input, mask):
        # the output is a single vector per sample, so there is no mask to pass on
        return None
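To make the computation concrete, here is a plain-NumPy sketch (my illustration, not part of the original answer) of the masked temporal mean the layer performs: the feature-wise sum over timesteps, divided by the number of unmasked timesteps.

import numpy as np

# one sample, 3 timesteps, 3 features; the last timestep is padding
x = np.array([[[1, 2, 3], [4, 5, 6], [0, 0, 0]]], dtype="float32")
mask = np.array([[1, 1, 0]], dtype="float32")  # 1 = real step, 0 = padding

ssum = x.sum(axis=-2)                    # (nb_samples, nb_features)
rcnt = mask.sum(axis=-1, keepdims=True)  # number of unmasked timesteps
print(ssum / rcnt)                       # [[2.5 3.5 4.5]], not the naive mean over all 3 steps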

Thanks, I ran into this problem too, but I don't think the TimeDistributed layer works the way you want. You can try Luke Guye's TemporalMeanPooling layer; it worked for me. Here is an example:

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)
pool = TemporalMeanPooling()(lstm)
output = Dense(1, activation='sigmoid')(pool)
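One caveat (my note, not from the original answer): TemporalMeanPooling only receives a mask if one is produced upstream, typically by setting mask_zero=True on the Embedding layer; otherwise the mask is None and the layer falls back to a plain mean over all timesteps, padding included. A sketch of the full wiring under that assumption:

from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

sequence = Input(shape=(max_sent_len,), dtype='int32')
# mask_zero=True makes the Embedding emit a padding mask that the LSTM propagates
embedded = Embedding(vocab_size, word_embedding_size, mask_zero=True)(sequence)
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)
pool = TemporalMeanPooling()(lstm)  # averages only the unmasked timesteps
output = Dense(1, activation='sigmoid')(pool)
model = Model(inputs=sequence, outputs=output)  # my addition, Keras 2 syntax (use input=/output= on Keras 1)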

I just tried to implement the same model as the original poster, using Keras 2.0.3. Average pooling after the LSTM works fine when I use GlobalAveragePooling1D; just make sure that return_sequences=True in the LSTM layer. Give it a try!
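A minimal sketch of that setup (the size variables are placeholders carried over from the question; the Model call is my addition):

from keras.layers import Input, Embedding, LSTM, GlobalAveragePooling1D, Dense
from keras.models import Model

sequence = Input(shape=(max_sent_len,), dtype='int32')
embedded = Embedding(vocab_size, word_embedding_size)(sequence)
lstm = LSTM(hidden_state_size, return_sequences=True)(embedded)  # keep the whole sequence
pool = GlobalAveragePooling1D()(lstm)  # (batch, features): mean over timesteps
output = Dense(1, activation='sigmoid')(pool)
model = Model(inputs=sequence, outputs=output)

Note that, unlike TemporalMeanPooling above, a plain GlobalAveragePooling1D may average over padded timesteps as well, depending on the Keras version's mask support.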

Very late to the party, but tf.keras.layers.AveragePooling1D with an appropriate pool_size argument also seems to return the correct result.

Working through the shared example:

import numpy as np
import tensorflow as tf

# Create sample data
A = np.array([[1, 2, 3], [4, 5, 6], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
B = np.array([[1, 3, 0], [4, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]])
C = np.array([A, B]).astype("float32")

# Expected answer (temporal mean)
np.mean(C, axis=1)

The output is

array([[1. , 1.4, 1.8],
       [1. , 0.6, 0.2]], dtype=float32)
Now using AveragePooling1D:

model = tf.keras.models.Sequential([
    tf.keras.layers.AveragePooling1D(pool_size=5)
])
model.predict(C)

The output is

array([[[1. , 1.4, 1.8]],
       [[1. , 0.6, 0.2]]], dtype=float32)
Some things to consider:

  • pool_size should be equal to the timestep size of the recurrent layer.
  • The output has shape (batch_size, downsampled_steps, features), which carries an extra downsampled_steps dimension. This will always be 1 if you set pool_size equal to the timestep size of the recurrent layer; a sketch of how to drop it follows this list.
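If that leftover singleton dimension is unwanted, one option (an assumption on my part, not from the answer) is to append a Flatten layer so the model emits (batch_size, features) directly:

import numpy as np
import tensorflow as tf

C = np.array([[[1, 2, 3], [4, 5, 6], [0, 0, 0], [0, 0, 0], [0, 0, 0]],
              [[1, 3, 0], [4, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0]]], dtype="float32")

model = tf.keras.models.Sequential([
    tf.keras.layers.AveragePooling1D(pool_size=5),  # (batch, 1, features)
    tf.keras.layers.Flatten(),                      # drop the singleton steps axis
])
print(model.predict(C))  # [[1.  1.4 1.8] [1.  0.6 0.2]], shape (2, 3)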