Python: Using TensorFlow to extract ELMo features and convert them to NumPy

python numpy tensorflow

I'm interested in using the ELMo model to extract sentence embeddings.

I first tried this:

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

x = ["Hi my friend"]

embeddings = elmo_model(x, signature="default", as_dict=True)["elmo"]


print(embeddings.shape)
print(embeddings.numpy())

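Everything runs fine up to the last line, where I can't convert the result to a NumPy array.

I searched around and found that adding the following line at the start of my code should solve the problem: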

tf.enable_eager_execution()

However, once I put that at the start of my code, I found that this line no longer runs:

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

I get this error:

How can I solve this? My goal is to obtain sentence features and use them as NumPy arrays.

Thanks in advance.

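TF 2.x

TF2 behaves more like ordinary Python because it defaults to eager execution. However, you should use hub.load rather than hub.Module to load the model in TF2: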

import tensorflow as tf
import tensorflow_hub as hub
elmo = hub.load("https://tfhub.dev/google/elmo/2").signatures["default"]
x = ["Hi my friend"]
embeddings = elmo(tf.constant(x))["elmo"]

You can then use the numpy method to access the result and convert it to a NumPy array:

>>> embeddings.numpy()
array([[[-0.7205108 , -0.27990735, -0.7735629 , ..., -0.24703965,
         -0.8358178 , -0.1974785 ],
        [ 0.18500198, -0.12270843, -0.35163105, ...,  0.14234722,
          0.08479916, -0.11709933],
        [-0.49985904, -0.88964033, -0.30124515, ...,  0.15846594,
          0.05210422,  0.25386307]]], dtype=float32)
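Since the goal is one feature vector per sentence, the per-token array above still needs to be reduced. Here is a minimal sketch of mean pooling in plain NumPy. It assumes the array has shape (batch, tokens, 1024), which is what I'd expect from this module for a single three-word sentence; mean_pool and fake_embeddings are names made up for illustration, and the random array only stands in for embeddings.numpy(). The module's documentation also describes a "default" output that is already mean-pooled, so check that first if it fits your case.

```python
import numpy as np


def mean_pool(token_embeddings, lengths=None):
    """Average per-token vectors into one vector per sentence.

    token_embeddings: array of shape (batch, max_tokens, dim).
    lengths: optional count of real tokens per sentence (assumed > 0);
             positions past that count are treated as padding and ignored.
    """
    token_embeddings = np.asarray(token_embeddings, dtype=np.float32)
    _, max_tokens, _ = token_embeddings.shape
    if lengths is None:
        # No padding information: average over every token position.
        return token_embeddings.mean(axis=1)
    lengths = np.asarray(lengths, dtype=np.float32)
    # mask[i, j] is True when position j is a real token of sentence i.
    mask = np.arange(max_tokens)[None, :] < lengths[:, None]
    summed = (token_embeddings * mask[:, :, None]).sum(axis=1)
    return summed / lengths[:, None]


# Stand-in for embeddings.numpy(): 1 sentence, 3 tokens, 1024 dimensions.
fake_embeddings = np.random.rand(1, 3, 1024).astype(np.float32)
sentence_vectors = mean_pool(fake_embeddings)
print(sentence_vectors.shape)  # (1, 1024)
```

The lengths argument only matters when you embed several sentences of different length in one batch, because shorter sentences are then padded up to the longest one.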

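TF 1.x

If you are using TF 1.x, you should run the op in a tf.Session. TensorFlow 1.x does not use eager execution by default, so you need to build the graph first and then evaluate the result in a session: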

elmo_model = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
x = ["Hi my friend"]
embeddings_op = elmo_model(x, signature="default", as_dict=True)["elmo"]
# required to load the weights into the graph
init_op = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init_op)
    embeddings = sess.run(embeddings_op)

In this case, the result is already a NumPy array:

>>> embeddings
array([[[-0.72051036, -0.27990723, -0.773563  , ..., -0.24703972,
         -0.83581805, -0.19747877],
        [ 0.18500218, -0.12270836, -0.35163072, ...,  0.14234722,
          0.08479934, -0.11709933],
        [-0.49985906, -0.8896401 , -0.3012453 , ...,  0.15846589,
          0.05210405,  0.2538631 ]]], dtype=float32)
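If the same downstream code has to work with both versions, a small helper can hide the difference between an eager tensor (TF2) and the array that a session returns (TF1). This is only a sketch: to_numpy and FakeEagerTensor are hypothetical names, and the fake class just imitates an object with a numpy method so the example runs without TensorFlow installed. It will not work on an unevaluated TF1 graph tensor; that still has to go through sess.run first.

```python
import numpy as np


def to_numpy(value):
    """Return value as a NumPy array.

    Arrays (e.g. the result of Session.run in TF1) pass through unchanged;
    objects with a .numpy() method (e.g. TF2 eager tensors) are converted.
    """
    if isinstance(value, np.ndarray):
        return value
    numpy_method = getattr(value, "numpy", None)
    if callable(numpy_method):
        return np.asarray(numpy_method())
    # Fallback for plain Python lists and scalars.
    return np.asarray(value)


class FakeEagerTensor:
    """Hypothetical stand-in for an eager tensor, for demonstration only."""

    def __init__(self, data):
        self._data = np.asarray(data, dtype=np.float32)

    def numpy(self):
        return self._data


from_tf2 = to_numpy(FakeEagerTensor([[0.1, 0.2], [0.3, 0.4]]))
from_tf1 = to_numpy(np.zeros((1, 3, 4), dtype=np.float32))
print(type(from_tf2).__name__, from_tf2.shape)  # ndarray (2, 2)
print(type(from_tf1).__name__, from_tf1.shape)  # ndarray (1, 3, 4)
```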
