Python: applying a function to a feature — "Tensor is unhashable. Instead, use tensor.ref() as the key"

I am trying to apply an encoding function to a TensorFlow dataset, but I am running into trouble with the correct way to reference a feature column. The function works as expected when there is only one input:
import pandas as pd
import tensorflow as tf
import tensorflow_datasets as tfds
from collections import Counter
from tensorflow.keras.preprocessing.sequence import pad_sequences
text = ["I played it a while but it was alright. The steam was a bit of trouble."
        " The more they move these game to steam the more of a hard time I have"
        " activating and playing a game. But in spite of that it was fun, I "
        "liked it. Now I am looking forward to anno 2205 I really want to "
        "play my way to the moon.",
        "This game is a bit hard to get the hang of, but when you do it's great."]
df = pd.DataFrame({"text": text})
dataset = (
    tf.data.Dataset.from_tensor_slices(
        tf.cast(df.text.values, tf.string)))
tokenizer = tfds.features.text.Tokenizer()
lowercase = True
vocabulary = Counter()
for text in dataset:
    if lowercase:
        text = tf.strings.lower(text)
    tokens = tokenizer.tokenize(text.numpy())
    vocabulary.update(tokens)
vocab_size = 5000
vocabulary, _ = zip(*vocabulary.most_common(vocab_size))
max_len = 15
max_sent = 5
encoder = tfds.features.text.TokenTextEncoder(vocabulary,
                                              lowercase=True,
                                              tokenizer=tokenizer)
def encode(text):
    sent_list = []
    sents = tf.strings.split(text, sep=". ").numpy()
    if max_sent:
        sents = sents[:max_sent]
    for sent in sents:
        text_encoded = encoder.encode(sent.decode())
        if max_len:
            text_encoded = text_encoded[:max_len]
        sent_list.append(pad_sequences([text_encoded], max_len))
    if len(sent_list) < 5:
        sent_list.append([tf.zeros(max_len) for _ in range(5 - len(sent_list))])
    return tf.concat(sent_list, axis=0)

def encode_pyfn(text):
    [text_encoded] = tf.py_function(encode, inp=[text], Tout=[tf.int32])
    return text_encoded
dataset = dataset.map(encode_pyfn).batch(batch_size=2)
next(iter(dataset))
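As an aside, the row shape that encode is building (at most max_sent sentences, each truncated and pre-padded to max_len token ids, then zero rows appended up to a fixed sentence count) can be sketched in plain Python. The whitespace tokenizer and on-the-fly vocab below are stand-ins for the tfds tokenizer and TokenTextEncoder, not the question's actual encoder:

```python
# Plain-Python sketch of the shape logic in `encode` above: split a text
# into at most `max_sent` sentences, map each sentence to at most `max_len`
# token ids, pre-pad each row to `max_len` (as pad_sequences does by
# default), and zero-pad the sentence list to exactly `max_sent` rows.
max_len, max_sent = 15, 5
vocab = {}  # hypothetical word -> id table, built on the fly

def encode_py(text):
    rows = []
    for sent in text.split(". ")[:max_sent]:
        ids = [vocab.setdefault(w, len(vocab) + 1) for w in sent.lower().split()]
        ids = ids[:max_len]
        rows.append([0] * (max_len - len(ids)) + ids)  # pre-pad with zeros
    rows += [[0] * max_len] * (max_sent - len(rows))   # pad sentence count
    return rows  # shape: (max_sent, max_len)
```

The result is always a fixed (5, 15) grid of ids, which is what lets the batched dataset stack examples.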
It raises the following error:
TypeError: in user code:
<ipython-input-9-30172a796c2e>:69 encode_pyfn *
features['text'] = tf.py_function(encode, inp=features[text], Tout=[tf.int32])
/Users/username/opt/anaconda3/lib/python3.8/site-packages/tensorflow/python/framework/ops.py:823 __hash__
raise TypeError("Tensor is unhashable. "
TypeError: Tensor is unhashable. Instead, use tensor.ref() as the key.
What is the correct way to apply the function to a single feature?

You probably meant to write features['text'] (i.e., add the quotes) instead of inp=features[text] :)

@today Thanks for catching that! Making that edit changes the error message to `TypeError: Expected list for 'input' argument to 'EagerPyFunc' Op, not Tensor("args_2:0", shape=(), dtype=string).` I tried unbatching (a suggestion from another question), but the error persists. Any ideas?

inp should be a list of tensors, so it should be inp=[features['text']]. Also, since you are not using unbatch here, you are working with a batch of samples; make sure your encode function takes that into account.

@scribbles, is your problem solved now? If not, could you share executable code that reproduces the issue? Thanks
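Putting the comments together, a minimal sketch of the suggested fix looks like the following: the feature dict is indexed with the string key 'text' (indexing with the tensor itself is what raises "Tensor is unhashable"), and inp is passed as a list of tensors. The length-counting encode here is a hypothetical stand-in for the question's real encoder, just to keep the example self-contained:

```python
import tensorflow as tf

def encode(text):
    # Hypothetical stand-in for the question's encoder: returns the byte
    # length of the input string as a 1-element int32 tensor.
    return tf.constant([len(text.numpy())], dtype=tf.int32)

def encode_pyfn(features):
    # `inp` must be a *list* of tensors, hence inp=[features['text']];
    # features[text] (no quotes) tries to use a Tensor as a dict key.
    [encoded] = tf.py_function(encode, inp=[features['text']], Tout=[tf.int32])
    features['text'] = encoded
    return features

dataset = tf.data.Dataset.from_tensor_slices({'text': ['abc', 'hello']}).map(encode_pyfn)
```

Iterating over this dataset yields dicts whose 'text' entry is the encoded tensor, which is the pattern the comments describe for mapping a py_function over a single feature column.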