How to implement word2vec CBOW in Keras with a shared Embedding layer and negative sampling?

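I want to build a network for pre-training word embeddings that adds something on top of word2vec CBOW, so I am first trying to implement word2vec CBOW itself. Since I am very new to Keras, I cannot figure out how to implement CBOW in it.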

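Initialization:

I have computed the vocabulary and have a mapping from words to integers.

Input to the network (not yet implemented):

A list of 2*k + 1 integers (representing the center word and the 2*k words in its context).

Network specification:

A shared Embedding layer should take this list of integers and give the corresponding vector outputs. Further, the mean of the 2*k context vectors is to be taken (I believe this can be done using add_node(layer, name, inputs=[2*k vectors], merge_mode='ave')).

It would be very helpful if anyone could share a small code snippet for this.

P.S.: I was looking at an existing implementation, but could not follow its code because it also uses gensim.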
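
UPDATE 1:

I want to share the Embedding layer in the network. The Embedding layer should be able to take the context words (2*k) as well as the current word. I can do this by passing all 2*k + 1 word indices as input and writing a custom Lambda function that does what is needed. But after that I also want to add a negative sampling network, for which I will need to embed more words and take their dot products with the context vector. Can someone provide an example where the Embedding layer is a shared node in a Graph() network?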
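
The network input described in the question (a center word plus its 2*k context words) could be produced roughly as follows. This is only a sketch under assumptions: the helper name `cbow_windows` and the toy corpus are made up for illustration, and positions without a full window are simply skipped (padding would be another option).

```python
import numpy as np

def cbow_windows(token_ids, k):
    """Return (centers, contexts) for every position that has a full window.

    centers has shape (n,), contexts has shape (n, 2*k).
    Positions closer than k to either end are skipped (an assumption).
    """
    token_ids = list(token_ids)
    centers, contexts = [], []
    for i in range(k, len(token_ids) - k):
        centers.append(token_ids[i])
        # k words to the left followed by k words to the right
        contexts.append(token_ids[i - k:i] + token_ids[i + 1:i + k + 1])
    centers = np.array(centers, dtype=np.int64)
    contexts = np.array(contexts, dtype=np.int64).reshape(-1, 2 * k)
    return centers, contexts

# Toy corpus already mapped to integers (hypothetical vocabulary of 10 words).
corpus = [4, 1, 7, 3, 9, 2, 5, 0, 8]
centers, contexts = cbow_windows(corpus, k=3)
print(centers)   # center word of each window
print(contexts)  # the 2*k surrounding words of each window
```

Each row of `contexts` can be fed to the context input of the network and the matching entry of `centers` to the word input.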
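
You can try something like this. Here I have initialized the embedding matrix with fixed values. For an input array of shape (1, 6) you get an output of shape (1, 100), where the 100-dimensional output vector is the mean of the 6 input embeddings.
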
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers import Lambda
from keras.layers.embeddings import Embedding

model = Sequential()
k = 3  # context window size
context_size = 2*k
# generate weight matrix for embeddings: row i is filled with the value i
embedding = []
for i in range(10):
    embedding.append(np.full(100, i))
embedding = np.array(embedding)
print(embedding)

# look up the context embeddings, then average them over the context axis
model.add(Embedding(input_dim=10, output_dim=100, input_length=context_size, weights=[embedding]))
model.add(Lambda(lambda x: K.mean(x, axis=1), output_shape=(100,)))

model.compile('rmsprop', 'mse')

input_array = np.random.randint(10, size=(1, context_size))
print(input_array.shape)

output_array = model.predict(input_array)
print(output_array.shape)
print(output_array[0])
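
To see what the Embedding plus mean Lambda computes without depending on a particular Keras version, the same lookup and averaging can be reproduced in plain NumPy. This is just an illustrative sketch, not part of the answer's code; the seeded generator `rng` is an assumption added for reproducibility.

```python
import numpy as np

vocab_size, dim, k = 10, 100, 3
context_size = 2 * k

# Same fixed weights as in the answer: row i is filled with the value i.
embedding = np.array([np.full(dim, i, dtype=np.float32) for i in range(vocab_size)])

rng = np.random.default_rng(0)
input_array = rng.integers(vocab_size, size=(1, context_size))

looked_up = embedding[input_array]   # embedding lookup, shape (1, 6, 100)
averaged = looked_up.mean(axis=1)    # mean over the context axis, shape (1, 100)

print(looked_up.shape, averaged.shape)
# Row i holds the constant i, so every output component equals the mean of the input indices.
print(np.allclose(averaged[0], input_array.mean()))
```

With the constant rows used here, a correct model output is easy to verify by eye: every component is the average of the six context indices.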

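
Graph() is no longer part of Keras. Arbitrary networks can be created with the functional API.

Below is demo code that creates a word2vec CBOW model with negative sampling, tested on random inputs.
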
# Written against the Keras 1.x API (merge function, Model(input=..., output=...)).
from keras import backend as K
import numpy as np
from keras.models import Sequential, Model
from keras.layers import Input, Lambda, Dense, merge
from keras.layers.embeddings import Embedding

k = 3  # context window size
context_size = 2*k
neg = 5  # number of negative samples
# generate weight matrix for embeddings: row i is filled with the value i
embedding = []
for i in range(10):
    embedding.append(np.full(100, i))
embedding = np.array(embedding)
print(embedding)

# Creating CBOW model
word_index = Input(shape=(1,))
context = Input(shape=(context_size,))
negative_samples = Input(shape=(neg,))
shared_embedding_layer = Embedding(input_dim=10, output_dim=100, weights=[embedding])

word_embedding = shared_embedding_layer(word_index)
context_embeddings = shared_embedding_layer(context)
negative_words_embedding = shared_embedding_layer(negative_samples)
cbow = Lambda(lambda x: K.mean(x, axis=1), output_shape=(100,))(context_embeddings)

word_context_product = merge([word_embedding, cbow], mode='dot')
negative_context_product = merge([negative_words_embedding, cbow], mode='dot', concat_axis=-1)

model = Model(input=[word_index, context, negative_samples], output=[word_context_product, negative_context_product])

model.compile(optimizer='rmsprop', loss='mse', metrics=['accuracy'])

input_context = np.random.randint(10, size=(1, context_size))
input_word = np.random.randint(10, size=(1,))
input_negative = np.random.randint(10, size=(1, neg))

print("word, context, negative samples")
print(input_word.shape, input_word)
print(input_context.shape, input_context)
print(input_negative.shape, input_negative)

output_dot_product, output_negative_product = model.predict([input_word, input_context, input_negative])
print("word cbow dot product")
print(output_dot_product.shape, output_dot_product)
print("cbow negative dot product")
print(output_negative_product.shape, output_negative_product)

Hope it helps!

UPDATE 1:

I have completed the code and uploaded it.
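
The demo above feeds random indices as negative samples. As a rough sketch of how negatives might be drawn in practice, the snippet below samples them with probability proportional to word count raised to the power 0.75, the heuristic from the original word2vec paper, and then computes the same two dot products in NumPy. Note the assumptions: `draw_negatives`, the random `counts`, and the random embedding matrix are all hypothetical and not part of the answer's code.

```python
import numpy as np

def draw_negatives(counts, target, neg, rng, power=0.75):
    """Draw `neg` word indices with probability proportional to counts**power.

    The target word gets probability zero, so it is never returned.
    """
    probs = np.asarray(counts, dtype=np.float64) ** power
    probs[target] = 0.0
    probs /= probs.sum()
    return rng.choice(len(probs), size=neg, replace=True, p=probs)

rng = np.random.default_rng(0)
vocab_size, dim, k, neg = 10, 100, 3, 5

counts = rng.integers(1, 50, size=vocab_size)           # hypothetical word frequencies
embedding = rng.normal(scale=0.1, size=(vocab_size, dim))  # hypothetical shared embedding matrix

word = 4                                                 # index of the center word
context = rng.integers(vocab_size, size=2 * k)           # indices of the 2*k context words
negatives = draw_negatives(counts, word, neg, rng)

cbow = embedding[context].mean(axis=0)                   # averaged context vector, shape (100,)
word_context_product = embedding[word] @ cbow            # scalar score for the true word
negative_context_product = embedding[negatives] @ cbow   # one score per negative, shape (5,)

print(negatives)
print(word_context_product, negative_context_product.shape)
```

All three lookups index the same `embedding` matrix, which is the NumPy analogue of calling one shared Embedding layer on three inputs.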
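
I think passing weights to keras.layers.Embedding in this form is deprecated, if you check this and this.

Maybe it is. This is an older answer and uses last year's version of Keras.

No worries! I mentioned it only to keep StackOverflow and its answers up to date. :)

Is it possible to use a softmax loss (in this case a sampled softmax loss, since there is negative sampling) instead of the mse loss?

I don't see a reason why other losses wouldn't work. You should try it, maybe you will see better performance :)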
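
For the loss question in the comments, one candidate to try is the negative-sampling objective from the word2vec paper, which is a sum of logistic losses (related to, but not the same as, sampled softmax). The sketch below writes it in NumPy for a single training example; it is an illustration of that formula rather than something the answer's model uses, and the function names are made up here.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)

def negative_sampling_loss(pos_score, neg_scores):
    """-log sigmoid(pos) - sum_j log sigmoid(-neg_j) for one training example.

    pos_score: dot product of the true word with the averaged context vector.
    neg_scores: dot products of the negative words with the same vector.
    """
    neg_scores = np.asarray(neg_scores, dtype=np.float64)
    return -(log_sigmoid(pos_score) + np.sum(log_sigmoid(-neg_scores)))

uninformed = negative_sampling_loss(0.0, np.zeros(5))       # all scores zero
confident = negative_sampling_loss(5.0, -5.0 * np.ones(5))  # true word high, negatives low
print(uninformed, confident)
```

The loss shrinks as the true word's score rises and the negatives' scores fall, which is the behaviour one would want from the two dot-product outputs of the model.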