Error loading wiki40b embeddings from TensorFlow Hub
I am trying to load this module with KerasLayer, but I'm not sure what causes this error:
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text
hub_url = "https://tfhub.dev/google/wiki40b-lm-nl/1"
embed = hub.KerasLayer(hub_url, input_shape=[],
dtype=tf.string)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-37-4e8ab0d5082c> in <module>()
5 hub_url = "https://tfhub.dev/google/wiki40b-lm-nl/1"
6 embed = hub.KerasLayer(hub_url, input_shape=[],
----> 7 dtype=tf.string)
1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow_hub/keras_layer.py in _get_callable(self)
300 if self._signature not in self._func.signatures:
301 raise ValueError("Unknown signature %s in %s (available signatures: %s)."
--> 302 % (self._signature, self._handle, self._func.signatures))
303 f = self._func.signatures[self._signature]
304 if not callable(f):
ValueError: Unknown signature default in https://tfhub.dev/google/wiki40b-lm-nl/1 (available signatures: _SignatureMap({'neg_log_likelihood': <ConcreteFunction pruned(text) at 0x7F3044A93210>, 'tokenization': <ConcreteFunction pruned(text) at 0x7F3040B7D190>, 'token_neg_log_likelihood': <ConcreteFunction pruned(token) at 0x7F3040D14810>, 'word_embeddings': <ConcreteFunction pruned(text) at 0x7F303D3FF2D0>, 'activations': <ConcreteFunction pruned(text) at 0x7F303D3FFF50>, 'prediction': <ConcreteFunction pruned(mem_4, mem_5, mem_6, mem_7, mem_8, mem_9, mem_10, mem_11, input_tokens, mem_0, mem_1, mem_2, mem_3) at 0x7F303C189090>, 'detokenization': <ConcreteFunction pruned(token_ids) at 0x7F3039860790>, 'token_word_embeddings': <ConcreteFunction pruned(token) at 0x7F3038FC2110>, 'token_activations': <ConcreteFunction pruned(token) at 0x7F303BAF9150>})).
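My question is: how do I use the embedding with a str as input, as indicated on the module page (Inputs section)?

Passing text wrapped in tf.constant to embed() and setting output_key (along with the matching signature) should work: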
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text
embed = hub.KerasLayer("https://tfhub.dev/google/wiki40b-lm-nl/1",
signature="word_embeddings",
output_key="word_embeddings")
embed(tf.constant("\n_START_ARTICLE_\n1001 vrouwen uit de Nederlandse "
"geschiedenis\n_START_SECTION_\nSelectie van vrouwen"
"\n_START_PARAGRAPH_\nDe 'oudste' biografie in het boek "
"is gewijd aan de beschermheilige"))
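(Tested with TF 2.4.1 and tensorflow_hub 0.11.0.)

Thanks. I'm not an expert on this kind of embedding, but while playing with it I noticed that each input text produces a 3-D output whose second dimension varies with the number of tokens. Sorry to ask another question, but would it be inappropriate to average the embeddings so that each input text gets a single row of embed_dim values?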
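For reference, the averaging described in the comment above (mean pooling over the token axis) could be sketched roughly as follows. This is only an illustration under assumptions: it uses NumPy stand-in arrays rather than real module output, assumes the (1, num_tokens, embed_dim) shape reported in the comment, and uses a made-up embed_dim of 4. The helper name mean_pool is hypothetical. Whether averaging is a good choice depends on the downstream task.

```python
import numpy as np


def mean_pool(token_embeddings):
    """Average over the token axis.

    (batch, num_tokens, embed_dim) -> (batch, embed_dim)
    """
    token_embeddings = np.asarray(token_embeddings)
    if token_embeddings.ndim != 3:
        raise ValueError(
            "expected a 3-D array, got shape %s" % (token_embeddings.shape,)
        )
    return token_embeddings.mean(axis=1)


# Stand-in data: three "texts" with 3, 7 and 5 tokens each.
# In practice each array would come from embed(tf.constant(text)).numpy();
# embed_dim = 4 is an arbitrary value chosen for this illustration.
rng = np.random.default_rng(0)
embed_dim = 4
per_text = [rng.normal(size=(1, n, embed_dim)) for n in (3, 7, 5)]

# Pool each text separately (their token counts differ), then stack the rows.
pooled = np.concatenate([mean_pool(x) for x in per_text], axis=0)
print(pooled.shape)  # (3, 4): one row of embed_dim values per input text
```

On a TensorFlow tensor the same reduction can be written as tf.reduce_mean(x, axis=1). Texts are pooled one at a time here because their token counts differ, so they cannot be stacked into a single dense array before averaging.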