TensorFlow embedding lookup gradients registered on CPU?


I have something equivalent to a sparse softmax:

...
with tf.device('/gpu:0'):
    indices = tf.placeholder(tf.int32, [None, dimsize])
    self._W = weight_variable([self._num_nodes, input_layer_size])
    self._b = bias_variable([self._num_nodes])
    sampled_W = tf.transpose(tf.nn.embedding_lookup(self._W, indices), [0,2,1]) # [batchsize, inputlayersize, dim1size]
    sampled_b = tf.nn.embedding_lookup(self._b, indices) # [batchsize, dim1size]
    ...

However, when I enable device placement logging, I see several gradient ops being placed on the CPU, for example:

gradients/.../embedding_lookup_1_grad/Size: /job:localhost/replica:0/task:0/cpu:0

I tensorflow/core/common_runtime/simple_placer.cc:819] gradients/.../embedding_lookup_1_grad/Size: /job:localhost/replica:0/task:0/cpu:0

This happens no matter which optimizer I choose. Am I missing something here?

If you use

tf.Session(config=tf.ConfigProto(allow_soft_placement=False))

you should get an error. This is because embedding_lookup is not currently implemented on the GPU.
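To check this on your own machine, a minimal sketch along the following lines may help. It assumes TensorFlow 2.x, using the tf.compat.v1 session API in place of the TF 1.x calls above. The helper name try_hard_placement and the tensor sizes are hypothetical and not from the question. Whether it fails, and on which op, depends on your TensorFlow version and hardware.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 2.x with the v1 compatibility API


def try_hard_placement(device="/gpu:0"):
    """Return None if the graph runs with soft placement off, else the error text."""
    graph = tf.Graph()
    with graph.as_default(), tf.device(device):
        indices = tf1.placeholder(tf.int32, [None, 3])
        # Hypothetical sizes: 6 rows of width 4.
        W = tf.Variable(np.ones([6, 4], dtype=np.float32))
        loss = tf.reduce_sum(tf.nn.embedding_lookup(W, indices))
        grads = tf.gradients(loss, [W])
        init = tf1.global_variables_initializer()

    # With soft placement disabled, TensorFlow may not silently move ops
    # to another device; an op that cannot run on `device` raises an error.
    config = tf1.ConfigProto(allow_soft_placement=False,
                             log_device_placement=True)
    try:
        with tf1.Session(graph=graph, config=config) as sess:
            sess.run(init)
            sess.run(grads, {indices: np.zeros([2, 3], dtype=np.int32)})
    except tf.errors.OpError as err:
        return err.message
    return None


result = try_hard_placement()
print("placement OK" if result is None else "placement failed:\n" + result)
```

If an error comes back, its message should name the op that could not be assigned to the requested device.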

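If this did run on the GPU, would there be any benefit to using embedding_lookup over gather?

That's a good point. I'm actually trying that myself. There doesn't seem to be any benefit. Problem solved.

@WesleyTansey I have a similar problem. So you solved yours by replacing embedding_lookup with gather? Is that right?

Yes. If you can fit everything in memory on the GPU, gather is all you need. Note that gather_nd does not support the GPU. For an example, see the LocallySmoothedMultiscaleLayer class here:

What about now?

@MeadowMuffins I haven't really kept up with TF development since PyTorch was released, so I can't say.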
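The swap described in the comments can be sketched roughly as below. This is not the original poster's code. It assumes TensorFlow 2.x with tf.compat.v1, plain tf.Variable objects in place of the weight_variable and bias_variable helpers (which are not shown in the question), and made-up sizes. For a single unpartitioned table with no max_norm, tf.gather returns the same rows as tf.nn.embedding_lookup.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TF 2.x with the v1 compatibility API

# Hypothetical sizes, chosen only for illustration.
num_nodes, input_layer_size, dimsize = 6, 4, 3

graph = tf.Graph()
with graph.as_default():
    indices = tf1.placeholder(tf.int32, [None, dimsize])
    W = tf.Variable(np.arange(num_nodes * input_layer_size, dtype=np.float32)
                    .reshape(num_nodes, input_layer_size))
    b = tf.Variable(np.arange(num_nodes, dtype=np.float32))

    # Original formulation.
    lookup_W = tf.nn.embedding_lookup(W, indices)
    lookup_b = tf.nn.embedding_lookup(b, indices)

    # Replacement suggested in the comments.
    gather_W = tf.gather(W, indices)
    gather_b = tf.gather(b, indices)

    # Same transpose as in the question: [batch, input_layer_size, dimsize].
    sampled_W = tf.transpose(gather_W, [0, 2, 1])
    init = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    batch = np.array([[0, 2, 5], [1, 1, 3]], dtype=np.int32)
    lw, gw, lb, gb, sw = sess.run(
        [lookup_W, gather_W, lookup_b, gather_b, sampled_W],
        {indices: batch})

print(np.array_equal(lw, gw), np.array_equal(lb, gb), sw.shape)
```

Whether the gather version and its gradient actually stay on the GPU depends on the TensorFlow version, so it is worth re-checking the placement log after making the change.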