Python: How do I use different convolution layers in different branches of tf.map_fn?


I am trying to build a simple multi-head attention layer in TensorFlow 1.14. Each head contains three different conv1d layers, and I want to use tf.map_fn to compute the heads in parallel:

import tensorflow as tf

n_head = 50 # heads counts

conv1d = tf.layers.conv1d
normalize = tf.contrib.layers.instance_norm 
activation = tf.nn.elu

f1d = tf.placeholder(shape=(None, 42), dtype=tf.float32)  # input feats

f1ds = tf.tile(f1d[None, ...], [n_head, 1, 1])  # n_head copies to apply different attention heads

def apply_attention(f1):
    f1 = activation(normalize(f1[None, ...]))
    q = conv1d(f1, 32, 3, padding='same')
    k = conv1d(f1, 32, 3, padding='same')  # [1,ncol, 32]
    v = conv1d(f1, 1, 3, padding='same')  # [1,ncol, 1]
    attention_map = tf.nn.softmax(tf.reduce_sum(q[0, None, :, :] * k[0, :, None, :], axis=-1) / (32 ** .5),
                                  axis=0)  # [ncol,ncol]
    return attention_map * v[0]

f1d_attention = tf.map_fn(lambda x: apply_attention(x), f1ds, dtype=tf.float32) 
But when I inspect the variables of this model, there seems to be only one set of conv1d variables in the whole model:

conv1d/bias/Adam [32]
conv1d/bias/Adam_1 [32]
conv1d/kernel [3, 42, 32]
conv1d/kernel/Adam [3, 42, 32]
conv1d/kernel/Adam_1 [3, 42, 32]
conv1d_1/bias [32]
conv1d_1/bias/Adam [32]
conv1d_1/bias/Adam_1 [32]
conv1d_1/kernel [3, 42, 32]
conv1d_1/kernel/Adam [3, 42, 32]
conv1d_1/kernel/Adam_1 [3, 42, 32]
conv1d_2/bias [1]
conv1d_2/bias/Adam [1]
conv1d_2/bias/Adam_1 [1]
conv1d_2/kernel [3, 42, 1]
conv1d_2/kernel/Adam [3, 42, 1]
conv1d_2/kernel/Adam_1 [3, 42, 1]
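
A listing like the one above can be produced along these lines (a sketch; it assumes an Adam optimizer has already been attached to the graph, which is where the */Adam slot variables come from):

for v in tf.global_variables():
    # Print each variable's full name and its static shape
    print(v.name, v.shape.as_list())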

What is wrong with my code?

In this case you do not want to use tf.map_fn. tf.map_fn evaluates the function only once and runs the different inputs through that same traced function, effectively using the same convolution layers for every input.
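
To see this, here is a minimal sketch (TF 1.x graph mode, with hypothetical shapes) that maps a layer-creating function over three inputs; only a single kernel/bias pair ends up in tf.global_variables():

import tensorflow as tf  # TF 1.x

xs = tf.placeholder(shape=(3, 10, 42), dtype=tf.float32)  # 3 "branches"

def branch(x):
    # tf.map_fn traces this function once, so tf.layers.conv1d creates
    # exactly one kernel/bias pair that is reused for every mapped element.
    return tf.layers.conv1d(x[None, ...], 8, 3, padding='same')[0]

ys = tf.map_fn(branch, xs, dtype=tf.float32)

print([v.name for v in tf.global_variables()])
# -> ['conv1d/kernel:0', 'conv1d/bias:0']  (a single set, shared by all inputs)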

You can achieve what you want with a simple for loop:

# Creating a different set of conv for each head
multi_head = [apply_attention(f1d) for _ in range(n_head)]
# stacking the result together on the first axis
f1d_attention = tf.stack(multi_head, axis=0)
For visibility I have reduced the number of heads to 2, but if we look at the variables we can see that two sets of convolutions have been instantiated:

>>> tf.global_variables()
[<tf.Variable 'InstanceNorm/beta:0' shape=(42,) dtype=float32_ref>,
 <tf.Variable 'InstanceNorm/gamma:0' shape=(42,) dtype=float32_ref>,
 <tf.Variable 'conv1d/kernel:0' shape=(3, 42, 32) dtype=float32_ref>,
 <tf.Variable 'conv1d/bias:0' shape=(32,) dtype=float32_ref>,
 <tf.Variable 'conv1d_1/kernel:0' shape=(3, 42, 32) dtype=float32_ref>,
 <tf.Variable 'conv1d_1/bias:0' shape=(32,) dtype=float32_ref>,
 <tf.Variable 'conv1d_2/kernel:0' shape=(3, 42, 1) dtype=float32_ref>,
 <tf.Variable 'conv1d_2/bias:0' shape=(1,) dtype=float32_ref>,
 <tf.Variable 'InstanceNorm_1/beta:0' shape=(42,) dtype=float32_ref>,
 <tf.Variable 'InstanceNorm_1/gamma:0' shape=(42,) dtype=float32_ref>,
 <tf.Variable 'conv1d_3/kernel:0' shape=(3, 42, 32) dtype=float32_ref>,
 <tf.Variable 'conv1d_3/bias:0' shape=(32,) dtype=float32_ref>,
 <tf.Variable 'conv1d_4/kernel:0' shape=(3, 42, 32) dtype=float32_ref>,
 <tf.Variable 'conv1d_4/bias:0' shape=(32,) dtype=float32_ref>,
 <tf.Variable 'conv1d_5/kernel:0' shape=(3, 42, 1) dtype=float32_ref>,
 <tf.Variable 'conv1d_5/bias:0' shape=(1,) dtype=float32_ref>]
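
As a side note on this approach: if you would rather have the per-head variables grouped under readable names instead of the auto-numbered conv1d_N scopes, one option (a sketch on top of the loop above, using a hypothetical head_i naming scheme) is to wrap each call in its own tf.variable_scope:

multi_head = []
for i in range(n_head):
    # Each head gets its own variable scope, so its variables show up as
    # head_0/conv1d/kernel, head_1/conv1d/kernel, ... instead of conv1d_N/...
    with tf.variable_scope('head_%d' % i):
        multi_head.append(apply_attention(f1d))
f1d_attention = tf.stack(multi_head, axis=0)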

Side note: unless you have a good reason not to, you should migrate from TensorFlow 1 to TensorFlow 2. Support for TF1 is limited.
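
For reference, a rough TF2/Keras sketch of the same idea (not a drop-in port of the code above; the attention math here follows the standard scaled dot-product form rather than the question's exact broadcasting) builds one independent set of Conv1D layers per head up front and applies them in a loop:

import tensorflow as tf  # TF 2.x

n_head = 50
d_k = 32

# One independent q/k/v Conv1D triple per head, created once up front.
heads = [
    (tf.keras.layers.Conv1D(d_k, 3, padding='same'),
     tf.keras.layers.Conv1D(d_k, 3, padding='same'),
     tf.keras.layers.Conv1D(1, 3, padding='same'))
    for _ in range(n_head)
]

def multi_head_attention(f1):
    # f1: [batch, ncol, feats]
    outputs = []
    for q_conv, k_conv, v_conv in heads:
        q, k, v = q_conv(f1), k_conv(f1), v_conv(f1)
        logits = tf.matmul(q, k, transpose_b=True) / (d_k ** 0.5)  # [batch, ncol, ncol]
        outputs.append(tf.matmul(tf.nn.softmax(logits, axis=-1), v))  # [batch, ncol, 1]
    return tf.stack(outputs, axis=1)  # [batch, n_head, ncol, 1]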