Implementing an LSTM regression model in TensorFlow


I am trying to implement a TensorFlow LSTM regression model for a list of input numbers. For example:

 input_data = [1, 2, 3, 4, 5]
 time_steps = 2
    -> X == [[1, 2], [2, 3], [3, 4]]
    -> y == [3, 4, 5]
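For reference, the windowing itself can be built with plain numpy; the sketch below (make_windows is a hypothetical helper, not part of the code that follows) reproduces the example above:

import numpy as np

def make_windows(data, time_steps):
    # Each window of time_steps consecutive values predicts the value right after it
    X = np.array([data[i:i + time_steps] for i in range(len(data) - time_steps)])
    y = np.array(data[time_steps:])
    return X, y

Xw, yw = make_windows([1, 2, 3, 4, 5], 2)
# Xw == [[1, 2], [2, 3], [3, 4]], yw == [3, 4, 5]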
Here is the code:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.metrics import mean_squared_error

TIMESTEPS = 20
num_hidden = 20

Xd, yd = load_data()

train_input = Xd['train']
train_input = train_input.reshape(-1,20,1)
train_output = yd['train']

# train_input = [[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20],..
# train_output  = [[21],[22],[23]....

test_input = Xd['test']
test_input = test_input.reshape(-1, 20, 1)
test_output = yd['test']

X = tf.placeholder(tf.float32, [None, 20, 1])
y = tf.placeholder(tf.float32, [None, 1])

cell = tf.nn.rnn_cell.LSTMCell(num_hidden, state_is_tuple=True)

val, state = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)
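# val has shape [batch_size, time_steps, num_hidden] because dynamic_rnn
# defaults to time_major=False; state is the cell's final LSTMStateTuple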
val = tf.Print(val, [tf.argmax(val,1)], 'argmax(val)=' , summarize=20, first_n=7)

val = tf.transpose(val, [1, 0, 2])
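# val is now [time_steps, batch_size, num_hidden], so the output of the
# last time step sits at index TIMESTEPS - 1 along axis 0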
val = tf.Print(val, [tf.argmax(val,1)], 'argmax(val2)=' , summarize=20, first_n=7)

# Take only the last output after 20 time steps
last = tf.gather(val, int(val.get_shape()[0]) - 1)
last = tf.Print(last, [tf.argmax(last,1)], 'argmax(val3)=' , summarize=20, first_n=7)

# define variables for weights and bias
weight = tf.Variable(tf.truncated_normal([num_hidden, int(y.get_shape()[1])]))
bias = tf.Variable(tf.constant(0.1, shape=[y.get_shape()[1]]))

# Prediction is matmul(last, weight) + bias
prediction = tf.matmul(last, weight) + bias

# Cross-entropy cost (note: no softmax is actually applied to prediction here)
# y is the true distribution and prediction is the predicted one
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(prediction), reduction_indices=[1]))
#cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))

optimizer = tf.train.AdamOptimizer()
minimize = optimizer.minimize(cost)

from tensorflow.python import debug as tf_debug
inita = tf.global_variables_initializer()  # initialize_all_variables() is deprecated
sess = tf.Session()
sess.run(inita)

batch_size = 100
no_of_batches = int(len(train_input)/batch_size)
epoch = 10
test_size = 100
for i in range(epoch):
    for start, end in zip(range(0, len(train_input), batch_size), range(batch_size, len(train_input)+1, batch_size)):
        sess.run(minimize, feed_dict={X: train_input[start:end], y: train_output[start:end]})

    test_indices = np.arange(len(test_input))  # Get A Test Batch
    np.random.shuffle(test_indices)
    test_indices = test_indices[0:test_size]
    print(i, mean_squared_error(test_output[test_indices], sess.run(prediction, feed_dict={X: test_input[test_indices]})))

print ("predictions", prediction.eval(feed_dict={X: train_input}, session=sess))
y_pred = prediction.eval(feed_dict={X: test_input}, session=sess)
sess.close()
test_size = test_output.shape[0]
ax = np.arange(0, test_size, 1)
plt.plot(ax, test_output, 'r', ax, y_pred, 'b')
plt.show()
But I am unable to minimize the cost; the computed MSE increases at every step instead of decreasing. I suspect there is a problem with the cost function I am using.

Any ideas or suggestions on what I am doing wrong?


Thanks

As mentioned in the comments, you have to change the loss function to an MSE function and lower the learning rate. Did your error converge to zero?
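In code, the suggested change amounts to something like this (a minimal sketch against the question's TF 1.x graph; the learning rate value is illustrative):

# tf.log(prediction) is undefined for the non-positive values an unbounded
# linear output can produce, which is one reason the cross-entropy cost diverges.
# For regression, use mean squared error instead:
cost = tf.reduce_mean(0.5 * tf.square(y - prediction))

# Pass an explicit, smaller learning rate to Adam
optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
minimize = optimizer.minimize(cost)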

Try using mean squared error instead of cross-entropy; you are not doing classification here: cost = 0.5 * tf.square(y - prediction); cost = tf.reduce_mean(cost)

Hi Anthony, thanks for the input. I tried using MSE as the cost as you suggested, but the computed MSE still increases after every epoch.

I see. Have you tried a smaller learning rate with the Adam optimizer?

I tried the default 0.1, then 0.05, 0.01 and 0.001, but the behavior is still the same. Do you think the code is otherwise usable?

Hi Anthony, I repeated my tests and your suggestions really helped. The MSE no longer explodes (increases continuously) but fluctuates instead, and the final plot shows the predictions track the test data fairly well. Thanks. Is there a way to mark your comment as the answer?

I have now tested a toy example dataset using a sine function. The MSE fluctuates within a small range (between 0.6 and 0.5 in this case) but never drops below that. Do you think a simple case like a sine function should converge to 0?
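For anyone reproducing the sine-wave toy test mentioned above, a possible dataset construction (purely illustrative; the original was not posted) could look like:

import numpy as np

# Sliding windows over a sine wave, shaped to match the [None, 20, 1] placeholder
series = np.sin(np.linspace(0, 20 * np.pi, 2000))
TIMESTEPS = 20
X_all = np.array([series[i:i + TIMESTEPS] for i in range(len(series) - TIMESTEPS)])
X_all = X_all.reshape(-1, TIMESTEPS, 1)
y_all = series[TIMESTEPS:].reshape(-1, 1)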