Python: applying a function to a feature, "Tensor is unhashable. Instead, use tensor.ref() as the key"

I am trying to apply an encoding function to a TensorFlow dataset, but I am having some trouble with the correct way to reference the feature column. The function works as expected when there is only a single input:

import pandas as pd
import tensorflow as tf
import tensorflow_datasets as tfds
from collections import Counter
from tensorflow.keras.preprocessing.sequence import pad_sequences


text = ["I played it a while but it was alright. The steam was a bit of trouble."
        " The more they move these game to steam the more of a hard time I have"
        " activating and playing a game. But in spite of that it was fun, I "
        "liked it. Now I am looking forward to anno 2205 I really want to "
        "play my way to the moon.",
        "This game is a bit hard to get the hang of, but when you do it's great."]


df = pd.DataFrame({"text": text})

dataset = (
    tf.data.Dataset.from_tensor_slices(
        tf.cast(df.text.values, tf.string)))

tokenizer = tfds.features.text.Tokenizer()

lowercase = True
vocabulary = Counter()
for text in dataset:
    if lowercase:
        text = tf.strings.lower(text)
    tokens = tokenizer.tokenize(text.numpy())
    vocabulary.update(tokens)

vocab_size = 5000
vocabulary, _ = zip(*vocabulary.most_common(vocab_size))

max_len = 15
max_sent = 5
encoder = tfds.features.text.TokenTextEncoder(vocabulary,
                                              lowercase=True,
                                              tokenizer=tokenizer)

def encode(text):
    sent_list = []
    sents = tf.strings.split(text, sep=". ").numpy()
    if max_sent:
        sents = sents[:max_sent]
    for sent in sents:
        text_encoded = encoder.encode(sent.decode())
        if max_len:
            text_encoded = text_encoded[:max_len]
            sent_list.append(pad_sequences([text_encoded], max_len))
    if len(sent_list) < 5:
        sent_list.append([tf.zeros(max_len) for _ in range(5 - len(sent_list))])
    return tf.concat(sent_list, axis=0)

def encode_pyfn(text):
    [text_encoded] = tf.py_function(encode, inp=[text], Tout=[tf.int32])
    return text_encoded

dataset = dataset.map(encode_pyfn).batch(batch_size=2)

next(iter(dataset))
When I apply it to a feature, however, it raises the following error:

TypeError: in user code:

    <ipython-input-9-30172a796c2e>:69 encode_pyfn  *
        features['text'] = tf.py_function(encode, inp=features[text], Tout=[tf.int32])
    /Users/username/opt/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:823 __hash__
        raise TypeError("Tensor is unhashable. "

    TypeError: Tensor is unhashable. Instead, use tensor.ref() as the key.

What is the correct way to apply the function to a single feature?
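
For reference, the message comes from tf.Tensor.__hash__: in TF 2.x a tensor cannot be hashed, so indexing a Python dict with a tensor key (as in inp=features[text] in the traceback above) fails. A minimal sketch, assuming TensorFlow 2.x, with names chosen only for illustration:

import tensorflow as tf

features = {"text": tf.constant("some example sentence")}
key = tf.constant("text")   # a string *tensor*, not the Python string "text"

try:
    features[key]           # the dict lookup calls hash(key) on a Tensor
except TypeError as err:
    print(err)              # Tensor is unhashable. Instead, use tensor.ref() as the key.

# features["text"] works, because the key is a plain Python string.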

You probably meant to write features['text'] (i.e., add the single quotes) instead of inp=features[text] :)

Thanks for taking a look! Making that edit changes the error message to: TypeError: Expected list for 'input' argument to 'EagerPyFunc' Op, not Tensor("args_2:0", shape=(), dtype=string). I tried unbatching (a suggestion from another question), but the error persists. Any ideas?

inp should be a list of tensors, so it should be inp=[features['text']]. Also, since you are not using unbatch here, you are working with a batch of samples, so make sure you take that into account when implementing the encode function. @scribbles, is your problem solved now? If not, could you share executable code that reproduces the issue? Thanks.
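
Putting the comments together, a minimal sketch of the corrected wrapper, assuming the real dataset yields a dictionary with a 'text' feature, as the traceback suggests (that version of the code is not shown in the post):

def encode_pyfn(features):
    # Use the string key 'text' (not a tensor) and pass inp as a *list* of tensors.
    [text_encoded] = tf.py_function(encode,
                                    inp=[features['text']],
                                    Tout=[tf.int32])
    features['text'] = text_encoded
    return features

dataset = dataset.map(encode_pyfn).batch(batch_size=2)

As noted in the comments, if the dataset is already batched at this point, encode would also have to handle a batch of strings rather than a single example.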