Word2Vec Tutorial: TensorFlow TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'


TensorFlow version: 1.2.1
Python version: 3.5
Operating system: Windows 10

Another poster asked this same question on StackOverflow, and he appeared to be working from the same Udacity Word2Vec tutorial. So maybe I'm being dense, but the code in this example is so busy and convoluted that I cannot tell what actually fixed his problem.

The error occurs when tf.reduce_mean is called:

loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                               train_labels, num_sampled, vocabulary_size))
Immediately before the call to tf.reduce_mean, the key variables have the following data types:

train_dataset.dtype
>> tf.int32
train_labels.dtype
>> tf.int32
valid_dataset.dtype
>> tf.int32
embeddings.dtype
>> tf.float32_ref
softmax_weights.dtype
>> tf.float32_ref
softmax_biases.dtype
>> tf.float32_ref
embed.dtype
>> tf.float32

I experimented with the dtypes of the variables train_dataset, train_labels, and valid_dataset: I made them all int64, all float32, all float64, and various combinations of ints and floats. Nothing worked. I did not try changing the dtypes of softmax_weights and softmax_biases, because I was afraid that might break the optimization algorithm. Don't these need to be floats to support the calculus performed during backpropagation? (TensorFlow is frequently a very opaque black box whose documentation verges on useless, so I can suspect things but can never be sure.)
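For concreteness, one of the experiments described above looked roughly like the following sketch (batch_size and valid_examples are defined in the full program further down; none of these variations fixed the error):

# One of the dtype experiments: declare the integer inputs as int64
# instead of int32. The TypeError was raised regardless.
train_dataset = tf.placeholder(tf.int64, shape=[batch_size])
train_labels = tf.placeholder(tf.int64, shape=[batch_size, 1])
valid_dataset = tf.constant(valid_examples, dtype=tf.int64)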

Program flow at the time of the error:

After reduce_mean is called, program control transfers to sampled_softmax_loss() in the file nn_impl.py.

At that point I check the data types of the incoming parameters and get the following results:

weights.dtype
>> tf.float32_ref
biases.dtype
>> tf.float32_ref
labels.dtype
>> tf.float32
inputs.dtype
>> tf.int32

Note that labels arrives as float32 and inputs as int32, the exact opposite of my graph, where train_labels is int32 and embed is float32. The exception occurs on the next step, and I am thrown into the StreamWrapper class in the file ansitowin32.py. Running it to the end, I get the following traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
    489                 as_ref=input_arg.is_ref,
--> 490                 preferred_dtype=default_dtype)
    491           except TypeError as err:

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
    740         if ret is None:
--> 741           ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    742 

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
    613         "Tensor conversion requested dtype %s for Tensor with dtype %s: %r"
--> 614         % (dtype.name, t.dtype.name, str(t)))
    615   return t

ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("sampled_softmax_loss/Reshape_1:0", shape=(?, 1, ?), dtype=float32, device=/device:CPU:0)'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-7-66d378b94a16> in <module>()
     34     loss = tf.reduce_mean(
     35       tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
---> 36                                train_labels, num_sampled, vocabulary_size))
     37 
     38     # Optimizer.

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
   1266       remove_accidental_hits=remove_accidental_hits,
   1267       partition_strategy=partition_strategy,
-> 1268       name=name)
   1269   sampled_losses = nn_ops.softmax_cross_entropy_with_logits(labels=labels,
   1270                                                             logits=logits)

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name)
   1005     row_wise_dots = math_ops.multiply(
   1006         array_ops.expand_dims(inputs, 1),
-> 1007         array_ops.reshape(true_w, new_true_w_shape))
   1008     # We want the row-wise dot plus biases which yields a
   1009     # [batch_size, num_true] tensor of true_logits.

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\math_ops.py in multiply(x, y, name)
    284 
    285 def multiply(x, y, name=None):
--> 286   return gen_math_ops._mul(x, y, name)
    287 
    288 

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\gen_math_ops.py in _mul(x, y, name)
   1375     A `Tensor`. Has the same type as `x`.
   1376   """
-> 1377   result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
   1378   return result
   1379 

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
    524                   "%s type %s of argument '%s'." %
    525                   (prefix, dtypes.as_dtype(attrs[input_arg.type_attr]).name,
--> 526                    inferred_from[input_arg.type_attr]))
    527 
    528           types = [values.dtype]

TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
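For what it's worth, the bottom of the traceback is an ordinary dtype-mismatch failure in math_ops.multiply, which can be reproduced in isolation with two toy tensors (hypothetical values, nothing to do with the tutorial):

import tensorflow as tf

# 'Mul' requires both operands to have the same dtype; mixing int32
# and float32 raises the same TypeError seen at the end of the traceback.
a = tf.constant([1, 2, 3], dtype=tf.int32)
b = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32)
c = tf.multiply(a, b)  # TypeError: Input 'y' of 'Mul' Op has type float32 ...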
Finally, here is the complete program:

# These are all the modules we'll be using later.
# Make sure you can import them before proceeding further.

# %matplotlib inline

from __future__ import print_function
import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import zipfile
from matplotlib import pylab
from six.moves import range
from six.moves.urllib.request import urlretrieve
from sklearn.manifold import TSNE

print("Working directory = %s\n" % os.getcwd())

def read_data(filename):
    """Extract the first file enclosed in a zip file as a list of words"""
    with zipfile.ZipFile(filename) as f:
        data = tf.compat.as_str(f.read(f.namelist()[0])).split()
    return data

filename = 'text8.zip'

words = read_data(filename)
print('Data size %d' % len(words))

vocabulary_size = 50000

def build_dataset(words):
    count = [['UNK', -1]]
    count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
    dictionary = dict()
    # Assign each of the most common words a unique integer ID,
    # in order of decreasing frequency (ID 0 is reserved for 'UNK')
    for word, _ in count:
        dictionary[word] = len(dictionary)
    data = list()
    unk_count = 0  # count of unknown words
    for word in words:
        if word in dictionary:
            index = dictionary[word]
        else:
            index = 0  # dictionary['UNK']
            unk_count = unk_count + 1
        data.append(index)
    count[0][1] = unk_count
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    return data, count, dictionary, reverse_dictionary


data, count, dictionary, reverse_dictionary = build_dataset(words)
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])
del words  # Hint to reduce memory.

data_index = 0

def generate_batch(batch_size, num_skips, skip_window):
    global data_index
    assert batch_size % num_skips == 0
    assert num_skips <= 2 * skip_window
    batch = np.ndarray(shape=(batch_size), dtype=np.int32)
    labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
    span = 2 * skip_window + 1 # [ skip_window target skip_window ]
    buffer = collections.deque(maxlen=span)
    for _ in range(span):
        buffer.append(data[data_index])
        data_index = (data_index + 1) % len(data)
    for i in range(batch_size // num_skips):
        target = skip_window  # target label at the center of the buffer
        targets_to_avoid = [ skip_window ]
        for j in range(num_skips):
            while target in targets_to_avoid:
                target = random.randint(0, span - 1)
            targets_to_avoid.append(target)
            batch[i * num_skips + j] = buffer[skip_window]
            labels[i * num_skips + j, 0] = buffer[target]
        buffer.append(data[data_index])
        data_index = (data_index + 1) % len(data)
    return batch, labels

print('data:', [reverse_dictionary[di] for di in data[:8]])

for num_skips, skip_window in [(2, 1), (4, 2)]:
    data_index = 0
    batch, labels = generate_batch(batch_size=8, num_skips=num_skips, skip_window=skip_window)
    print('\nwith num_skips = %d and skip_window = %d:' % (num_skips, skip_window))
    print('    batch:', [reverse_dictionary[bi] for bi in batch])
    print('    labels:', [reverse_dictionary[li] for li in labels.reshape(8)])

batch_size = 128
embedding_size = 128  # Dimension of the embedding vector.
skip_window = 1  # How many words to consider left and right.
num_skips = 2  # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16  # Random set of words to evaluate similarity on.
valid_window = 100  # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(range(valid_window), valid_size))
num_sampled = 64  # Number of negative examples to sample.

graph = tf.Graph()

with graph.as_default(), tf.device('/cpu:0'):
    # Input data.
    train_dataset = tf.placeholder(tf.int32, shape=[batch_size])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
    valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

    # Variables.
    embeddings = tf.Variable(
        tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    softmax_weights = tf.Variable(
        tf.truncated_normal([vocabulary_size, embedding_size],
                            stddev=1.0 / math.sqrt(embedding_size)))
    softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))

    # Model.
    # Look up embeddings for inputs.
    embed = tf.nn.embedding_lookup(embeddings, train_dataset)
    # Compute the softmax loss, using a sample of the negative labels each time.
    loss = tf.reduce_mean(
        tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                                   train_labels, num_sampled, vocabulary_size))

    # Optimizer.
    # Note: The optimizer will optimize the softmax_weights AND the embeddings.
    # This is because the embeddings are defined as a variable quantity and the
    # optimizer's `minimize` method will by default modify all variable quantities
    # that contribute to the tensor it is passed.
    # See docs on `tf.train.Optimizer.minimize()` for more details.
    optimizer = tf.train.AdagradOptimizer(1.0).minimize(loss)

    # Compute the similarity between minibatch examples and all embeddings.
    # We use the cosine distance:
    norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
    normalized_embeddings = embeddings / norm
    valid_embeddings = tf.nn.embedding_lookup(
        normalized_embeddings, valid_dataset)
    similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))
For reference, here is the signature of sampled_softmax_loss in this version of TensorFlow:

sampled_softmax_loss(
    weights,
    biases,
    labels,
    inputs,
    num_sampled,
    num_classes,
    num_true=1,
    sampled_values=None,
    remove_accidental_hits=True,
    partition_strategy='mod',
    name='sampled_softmax_loss'
)
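Comparing this signature with the call in my program, the third and fourth positional arguments look like the problem: the tutorial passes embed where labels is expected and train_labels where inputs is expected, which would explain the swapped dtypes I observed inside sampled_softmax_loss (the tutorial was presumably written against an older API in which inputs preceded labels). As a minimal sketch, a call that matches the signature above, using keyword arguments to remove any ambiguity about ordering, would look like this:

loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(weights=softmax_weights,
                               biases=softmax_biases,
                               labels=train_labels,  # int32, shape [batch_size, 1]
                               inputs=embed,         # float32 embedding lookups
                               num_sampled=num_sampled,
                               num_classes=vocabulary_size))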