Python: Creating embeddings and training them with a bigram LSTM model in TensorFlow

Tags: python, machine-learning, tensorflow

I'm having trouble figuring out how to create and train bigram embeddings for an LSTM in TensorFlow.

We're initially given the sequence data as a tensor of shape (num_unrollings, batch_size, 27), where num_unrollings is the total number of batches, batch_size is the size of each batch, and 27 is the size of the one-hot encoding vector for the characters 'a' through 'z' plus the space character ' '.

The LSTM takes one batch as input at each time step, i.e. it takes a tensor of shape (batch_size, 27).
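For concreteness, here is a minimal NumPy sketch of these shapes with dummy data (the array contents are made up purely for illustration):

import numpy as np

num_unrollings = 10   # number of time steps / batches in a sequence window
batch_size = 64       # examples per batch
vocab_size = 27       # 'a' through 'z' plus ' '

# dummy one-hot sequence data of shape (num_unrollings, batch_size, 27)
ids = np.random.randint(0, vocab_size, size=(num_unrollings, batch_size))
data = np.eye(vocab_size, dtype=np.float32)[ids]

step_input = data[0]                 # one time step: shape (batch_size, 27)
print(data.shape, step_input.shape)  # (10, 64, 27) (64, 27)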

characters() is a helper function that takes a one-hot encoded tensor of size 27 and returns the most likely character.
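The characters() helper isn't shown in the question; a minimal sketch of what such a helper could look like, assuming index 26 encodes the space character:

import numpy as np

def characters(one_hot):
    # map a one-hot (or probability) vector of length 27 back to a character;
    # index 26 is assumed to be the space character
    idx = int(np.argmax(one_hot))
    return " " if idx == 26 else chr(ord('a') + idx)

print(characters(np.eye(27)[0]))   # 'a'
print(characters(np.eye(27)[26]))  # ' '

Note that a helper like this operates on concrete NumPy values; it cannot be applied directly to a symbolic TensorFlow tensor inside the graph.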

What I've done so far is create an index lookup for each bigram. We have 27 * 27 = 729 bigrams in total (since I include the ' ' character). I chose to represent each bigram with a vector of log2(729) ≈ 10 bits.

Ultimately, I'm trying to feed the input to my LSTM as a tensor of shape (batch_size / 2, 10). That way I can train on bigrams.
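To make the intended pipeline concrete, here is a rough NumPy sketch (the names are mine) of collapsing each pair of one-hot character rows into a single integer bigram id, which is what would then be embedded:

import numpy as np

# dummy batch of one-hot characters, shape (batch_size, 27)
batch = np.eye(27, dtype=np.float32)[np.random.randint(0, 27, size=64)]

first = np.argmax(batch[0::2], axis=1)   # first character of each bigram
second = np.argmax(batch[1::2], axis=1)  # second character of each bigram
bigram_ids = first * 27 + second         # shape (32,), values in [0, 729)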

Here is the relevant code:

import tensorflow as tf

batch_size = 64
num_unrollings = 10
vocabulary_size = 27   # 'a' through 'z' plus ' '
num_embeddings = 729   # 27 * 27 possible bigrams
embedding_size = 10

# build dictionary of bigrams and their respective indices:
bigram2id = dict()
for i in range(vocabulary_size):
    first = " " if i == 26 else chr(ord('a') + i)
    for j in range(vocabulary_size):
        second = " " if j == 26 else chr(ord('a') + j)
        bigram2id[first + second] = i * 27 + j
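A quick sanity check of the resulting mapping:

print(len(bigram2id))   # 729
print(bigram2id["aa"])  # 0
print(bigram2id["ab"])  # 1
print(bigram2id["  "])  # 728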

graph = tf.Graph()

with graph.as_default():

    # embeddings (held fixed for now, hence trainable=False)
    embeddings = tf.Variable(
        tf.random_uniform([num_embeddings, embedding_size], -1.0, 1.0),
        trainable=False)

    """
    1) load the training data as we would normally
    2) look up the embeddings of the data, then from there get the inputs and the labels
    3) train
    """

    # placeholders for the unembedded (one-hot) data and the embedded data
    train_data = list()
    embedded_train_data = list()
    for _ in range(num_unrollings + 1):
        train_data.append(tf.placeholder(tf.float32, shape=[batch_size, vocabulary_size]))
        embedded_train_data.append(tf.placeholder(tf.float32, shape=[batch_size // 2, embedding_size]))

    # look up embeddings for training data and labels
    for batch_ctr in range(num_unrollings + 1):
        for bigram_ctr in range(batch_size // 2):
            # get current bigram
            current_bigram = (characters(train_data[batch_ctr][bigram_ctr * 2])
                              + characters(train_data[batch_ctr][bigram_ctr * 2 + 1]))
            # look up id
            current_bigram_id = bigram2id[current_bigram]
            # look up embedding
            embedded_bigram = tf.nn.embedding_lookup(embeddings, current_bigram_id)
            # add to current batch
            embedded_train_data[batch_ctr][bigram_ctr].append(embedded_bigram)
But right now I'm getting a "Shape (64, 27) must have rank 1" error, and even if I fix that, I'm not sure whether I'm taking the right approach.
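For what it's worth, the rank error is consistent with tf.nn.embedding_lookup expecting integer ids (typically a rank-1 vector per batch) rather than one-hot float rows; a minimal standalone sketch of that usage, with placeholder names of my own:

import tensorflow as tf

batch_size = 64
num_embeddings = 729
embedding_size = 10

embeddings = tf.Variable(tf.random_uniform([num_embeddings, embedding_size], -1.0, 1.0))

# a rank-1 vector of integer bigram ids, one per bigram in the half-batch
bigram_ids = tf.placeholder(tf.int32, shape=[batch_size // 2])

# result has shape (batch_size // 2, embedding_size)
embedded_batch = tf.nn.embedding_lookup(embeddings, bigram_ids)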