Python: Creating and training bigram embeddings for an LSTM model in TensorFlow
I'm having trouble figuring out how to create and train bigram embeddings for an LSTM in TensorFlow.

We are initially given the sequence data as a tensor of shape `(num_unrollings, batch_size, 27)`, where `num_unrollings` is the total number of batches, `batch_size` is the size of each batch, and `27` is the size of the one-hot encoded vector for the characters 'a' through 'z', plus ' ' (space).

The LSTM takes one batch as input at each time step, i.e. it takes a tensor of shape `(batch_size, 27)`.

`characters()` is a function that takes a tensor of shape `27` and returns the most likely character from the one-hot encoding.
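`characters()` itself isn't shown; for reference, here is a minimal NumPy sketch of such an encoder/decoder pair, assuming indices 0-25 map to 'a'-'z' and index 26 maps to ' ' (the helper names here are hypothetical, not from the original code):

```python
import numpy as np

def onehot(ch):
    """Encode a single character as a length-27 one-hot vector."""
    idx = 26 if ch == ' ' else ord(ch) - ord('a')
    vec = np.zeros(27, dtype=np.float32)
    vec[idx] = 1.0
    return vec

def char_from_onehot(vec):
    """Return the most likely character from a length-27 one-hot (or softmax) vector."""
    idx = int(np.argmax(vec))
    return ' ' if idx == 26 else chr(ord('a') + idx)
```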
What I've done so far is create an index lookup for each bigram. We have 27 * 27 = 729 bigrams in total (since I include the ' ' character). I chose to represent each bigram with a vector of log2(729) ≈ 10 dimensions.

In the end, I'm trying to feed the LSTM inputs as tensors of shape `(batch_size / 2, 10)`, so that I can train on bigrams.
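The id arithmetic here can be sanity-checked in plain Python (a sketch; `char_index` and `bigram_id` are hypothetical helpers, not part of the original code):

```python
import math

def char_index(ch):
    """Map 'a'..'z' to 0..25 and ' ' to 26."""
    return 26 if ch == ' ' else ord(ch) - ord('a')

def bigram_id(c1, c2):
    """A bigram's id is first_index * 27 + second_index, giving ids in [0, 728]."""
    return char_index(c1) * 27 + char_index(c2)

# 729 ids need ceil(log2(729)) = 10 bits, hence an embedding size of 10.
bits_needed = math.ceil(math.log2(27 * 27))
```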
Here is the relevant code:
```python
batch_size = 64
num_unrollings = 10
num_embeddings = 729
embedding_size = 10

bigram2id = dict()
key = ""

# build dictionary of bigrams and their respective indices:
for i in range(ord('z') - ord('a') + 2):
    key = chr(97 + i)
    if i == 26:
        key = " "
    for j in range(ord('z') - ord('a') + 2):
        if j == 26:
            bigram2id[key + " "] = i * 27 + j
            continue
        bigram2id[key + chr(97 + j)] = i * 27 + j
```
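For comparison, the same mapping can be built with a comprehension over a fixed alphabet (a sketch, not the original code; `bigram2id_alt` is a hypothetical name):

```python
import string

# 27 characters, with ' ' in the last position so it gets index 26.
alphabet = string.ascii_lowercase + ' '

bigram2id_alt = {a + b: i * 27 + j
                 for i, a in enumerate(alphabet)
                 for j, b in enumerate(alphabet)}
```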
```python
graph = tf.Graph()

with graph.as_default():
    # embeddings
    embeddings = tf.Variable(tf.random_uniform([num_embeddings, embedding_size], -1.0, 1.0), trainable=False)

    """
    1) load the training data as we would normally
    2) look up the embeddings of the data then from there get the inputs and the labels
    3) train
    """

    # load training data, labels for both unembedded and embedded data
    train_data = list()
    embedded_train_data = list()
    for _ in range(num_unrollings + 1):
        train_data.append(tf.placeholder(tf.float32, shape=[batch_size, vocabulary_size]))
        embedded_train_data.append(tf.placeholder(tf.float32, shape=[batch_size // 2, embedding_size]))

    # look up embeddings for training data and labels (make sure to set trainable=False)
    for batch_ctr in range(num_unrollings + 1):
        for bigram_ctr in range(batch_size // 2):
            # get current bigram
            current_bigram = characters(train_data[batch_ctr][bigram_ctr * 2]) + characters(train_data[batch_ctr][bigram_ctr * 2 + 1])
            # look up id
            current_bigram_id = bigram2id[current_bigram]
            # look up embedding
            embedded_bigram = tf.nn.embedding_lookup(embeddings, current_bigram_id)
            # add to current batch
            embedded_train_data[batch_ctr][bigram_ctr].append(embedded_bigram)
```
But right now I'm getting a "Shape (64, 27) must be rank 1" error, and even if I fix that, I'm not sure I'm taking the right approach.
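For reference, the shape bookkeeping I'm aiming for can be checked outside the graph with NumPy (`tf.argmax` and `tf.nn.embedding_lookup` behave analogously on tensors; this is a sketch with made-up data, not the original TF code):

```python
import numpy as np

batch_size, vocab, embedding_size = 64, 27, 10
rng = np.random.default_rng(0)

# Fake one-hot batch of shape (batch_size, 27), like one unrolling step.
ids = rng.integers(0, vocab, size=batch_size)
batch = np.eye(vocab, dtype=np.float32)[ids]

# Recover integer character indices (tf.argmax(batch, axis=1) in TF).
char_ids = batch.argmax(axis=1)                       # shape (64,)

# Pair consecutive characters into bigram ids: first * 27 + second.
bigram_ids = char_ids[0::2] * vocab + char_ids[1::2]  # shape (32,), rank 1

# Embedding lookup = row selection from the (729, 10) embedding matrix
# (tf.nn.embedding_lookup(embeddings, bigram_ids) in TF).
embeddings = rng.uniform(-1.0, 1.0, size=(vocab * vocab, embedding_size))
embedded = embeddings[bigram_ids]                     # shape (32, 10)
```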