Does the number of parameters double after keras.layers.Bidirectional?

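Below are the code and the results. There are two models, one of which is bidirectional. My question is: why is the parameter count of time_distributed_14 (264) not twice that of time_distributed_13 (136)? I know that 264 = 136*2 - 8. Why is the -8 needed?

Result: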

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
gru_9 (GRU)                  (None, 64, 16)            1536      
_________________________________________________________________
time_distributed_13 (TimeDis (None, 64, 8)             136       
_________________________________________________________________
activation_6 (Activation)    (None, 64, 8)             0         
=================================================================
Total params: 1,672
Trainable params: 1,672
Non-trainable params: 0
_________________________________________________________________
None
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
bidirectional_7 (Bidirection (None, 64, 32)            3072      
_________________________________________________________________
time_distributed_14 (TimeDis (None, 64, 8)             264       
_________________________________________________________________
activation_7 (Activation)    (None, 64, 8)             0         
=================================================================
Total params: 3,336
Trainable params: 3,336
Non-trainable params: 0
_________________________________________________________________
None

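The layer has biases as well as weights, and the number of biases depends only on the output size, not on the input size. Doubling the input width from 16 to 32 therefore doubles the weights but leaves the 8 biases unchanged.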

weights = input * output 
       - regular: = 16*8 = 128
       - bidirec: = 32*8 = 256

biases = output
       - regular: = 8
       - bidirec: = 8

parameters = weights + biases
       - regular: = 128 + 8 = 136
       - bidirec: = 256 + 8 = 264
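
The same arithmetic can be checked with a few lines of plain Python. This is only a sketch: it assumes the TimeDistributed layer wraps a fully connected layer with 8 output units and a bias, which is what the numbers above suggest, and `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(input_dim: int, units: int, use_bias: bool = True) -> int:
    """Count the parameters of a fully connected layer (hypothetical helper)."""
    weights = input_dim * units          # one weight per (input, output) pair
    biases = units if use_bias else 0    # one bias per output unit, independent of the input
    return weights + biases


# Input widths taken from the two summaries: 16 after the GRU, 32 after Bidirectional.
regular = dense_param_count(16, 8)
bidirec = dense_param_count(32, 8)

print(regular, bidirec)          # 136 264
print(2 * regular - bidirec)     # 8, the bias vector that would otherwise be counted twice
```

Only the weight matrix scales with the input width, so the difference from an exact doubling is always the number of output units.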