Python: Creating and training bigram embeddings for an LSTM model in TensorFlow
I'm having trouble figuring out how to create and train bigram embeddings for an LSTM in TensorFlow.

We are initially given the sequence data as a tensor of shape `(num_unrollings, batch_size, 27)`, where `num_unrollings` is the total number of batches, `batch_size` is the size of each batch, and `27` is the size of the one-hot encoded vector for the characters 'a' through 'z', plus ' ' (space).

The LSTM takes one batch as input at each time step, i.e. it takes a tensor of shape `(batch_size, 27)`.

`characters()` is a function that takes a tensor of shape `27` and returns the most likely character from the one-hot encoding.
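`characters()` itself isn't shown; for reference, here is a minimal NumPy sketch of such an encoder/decoder pair, assuming indices 0-25 map to 'a'-'z' and index 26 maps to ' ' (the helper names here are hypothetical, not from the original code):

```python
import numpy as np

def onehot(ch):
    """Encode a single character as a length-27 one-hot vector."""
    idx = 26 if ch == ' ' else ord(ch) - ord('a')
    vec = np.zeros(27, dtype=np.float32)
    vec[idx] = 1.0
    return vec

def char_from_onehot(vec):
    """Return the most likely character from a length-27 one-hot (or softmax) vector."""
    idx = int(np.argmax(vec))
    return ' ' if idx == 26 else chr(ord('a') + idx)
```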
What I've done so far is create an index lookup for each bigram. We have 27 * 27 = 729 bigrams in total (since I include the ' ' character). I chose to represent each bigram with a vector of log2(729) ≈ 10 dimensions.

In the end, I'm trying to feed the LSTM inputs as tensors of shape `(batch_size / 2, 10)`, so that I can train on bigrams.
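The id arithmetic here can be sanity-checked in plain Python (a sketch; `char_index` and `bigram_id` are hypothetical helpers, not part of the original code):

```python
import math

def char_index(ch):
    """Map 'a'..'z' to 0..25 and ' ' to 26."""
    return 26 if ch == ' ' else ord(ch) - ord('a')

def bigram_id(c1, c2):
    """A bigram's id is first_index * 27 + second_index, giving ids in [0, 728]."""
    return char_index(c1) * 27 + char_index(c2)

# 729 ids need ceil(log2(729)) = 10 bits, hence an embedding size of 10.
bits_needed = math.ceil(math.log2(27 * 27))
```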
Here is the relevant code:
```python
batch_size = 64
num_unrollings = 10
num_embeddings = 729
embedding_size = 10

bigram2id = dict()
key = ""

# build dictionary of bigrams and their respective indices:
for i in range(ord('z') - ord('a') + 2):
    key = chr(97 + i)
    if i == 26:
        key = " "
    for j in range(ord('z') - ord('a') + 2):
        if j == 26:
            bigram2id[key + " "] = i * 27 + j
            continue
        bigram2id[key + chr(97 + j)] = i * 27 + j
```
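For comparison, the same mapping can be built with a comprehension over a fixed alphabet (a sketch, not the original code; `bigram2id_alt` is a hypothetical name):

```python
import string

# 27 characters, with ' ' in the last position so it gets index 26.
alphabet = string.ascii_lowercase + ' '

bigram2id_alt = {a + b: i * 27 + j
                 for i, a in enumerate(alphabet)
                 for j, b in enumerate(alphabet)}
```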
```python
graph = tf.Graph()

with graph.as_default():
    # embeddings
    embeddings = tf.Variable(tf.random_uniform([num_embeddings, embedding_size], -1.0, 1.0), trainable=False)

    """
    1) load the training data as we would normally
    2) look up the embeddings of the data then from there get the inputs and the labels
    3) train
    """

    # load training data, labels for both unembedded and embedded data
    train_data = list()
    embedded_train_data = list()
    for _ in range(num_unrollings + 1):
        train_data.append(tf.placeholder(tf.float32, shape=[batch_size, vocabulary_size]))
        embedded_train_data.append(tf.placeholder(tf.float32, shape=[batch_size // 2, embedding_size]))

    # look up embeddings for training data and labels (make sure to set trainable=False)
    for batch_ctr in range(num_unrollings + 1):
        for bigram_ctr in range(batch_size // 2):
            # get current bigram
            current_bigram = characters(train_data[batch_ctr][bigram_ctr * 2]) + characters(train_data[batch_ctr][bigram_ctr * 2 + 1])
            # look up id
            current_bigram_id = bigram2id[current_bigram]
            # look up embedding
            embedded_bigram = tf.nn.embedding_lookup(embeddings, current_bigram_id)
            # add to current batch
            embedded_train_data[batch_ctr][bigram_ctr].append(embedded_bigram)
```
But right now I'm getting a "Shape (64, 27) must be rank 1" error, and even if I fix that, I'm not sure I'm taking the right approach.
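For reference, the shape bookkeeping I'm aiming for can be checked outside the graph with NumPy (`tf.argmax` and `tf.nn.embedding_lookup` behave analogously on tensors; this is a sketch with made-up data, not the original TF code):

```python
import numpy as np

batch_size, vocab, embedding_size = 64, 27, 10
rng = np.random.default_rng(0)

# Fake one-hot batch of shape (batch_size, 27), like one unrolling step.
ids = rng.integers(0, vocab, size=batch_size)
batch = np.eye(vocab, dtype=np.float32)[ids]

# Recover integer character indices (tf.argmax(batch, axis=1) in TF).
char_ids = batch.argmax(axis=1)                       # shape (64,)

# Pair consecutive characters into bigram ids: first * 27 + second.
bigram_ids = char_ids[0::2] * vocab + char_ids[1::2]  # shape (32,), rank 1

# Embedding lookup = row selection from the (729, 10) embedding matrix
# (tf.nn.embedding_lookup(embeddings, bigram_ids) in TF).
embeddings = rng.uniform(-1.0, 1.0, size=(vocab * vocab, embedding_size))
embedded = embeddings[bigram_ids]                     # shape (32, 10)
```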