MLP on TensorFlow (Python) gives the same prediction for all observations after training
I am trying to train an MLP on sparse data for a regression task. However, the predictions on the test data come out as the same value for all observations. Once I omit the activation function in each layer, the results differ. My code is as follows:
# imports
import numpy as np
import tensorflow as tf
import random
import json
from scipy.sparse import rand
# Parameters
learning_rate = 0.1
training_epochs = 50
batch_size = 100
# Network Parameters
m = 1000 # number of features
n = 5000 # number of observations
hidden_layers = [5, 2, 4, 1, 6]
n_layers = len(hidden_layers)
n_input = m
n_classes = 1 # it's a regression problem
X_train = rand(n, m, density=0.2, format='csr').todense().astype(np.float32)
Y_train = np.random.randint(4, size=n)
X_test = rand(200, m, density=0.2, format='csr').todense().astype(np.float32)
Y_test = np.random.randint(4, size=200)
# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None])
# Store layers weight & bias
weights = {}
biases = {}
weights['h1'] = tf.Variable(tf.random_normal([n_input, hidden_layers[0]])) # first matrix
biases['b1'] = tf.Variable(tf.random_normal([hidden_layers[0]]))
for i in xrange(2, n_layers+1):
    weights['h'+str(i)] = tf.Variable(tf.random_normal([hidden_layers[i-2], hidden_layers[i-1]]))
    biases['b'+str(i)] = tf.Variable(tf.random_normal([hidden_layers[i-1]]))
weights['out'] = tf.Variable(tf.random_normal([hidden_layers[-1], 1])) # matrix between last layer and output
biases['out'] = tf.Variable(tf.random_normal([1]))
# Create model
def multilayer_perceptron(_X, _weights, _biases):
    layer_begin = tf.nn.relu(tf.add(tf.matmul(_X, _weights['h1'], a_is_sparse=True), _biases['b1']))
    for layer in xrange(2, n_layers+1):
        layer_begin = tf.nn.relu(tf.add(tf.matmul(layer_begin, _weights['h'+str(layer)]), _biases['b'+str(layer)]))
        #layer_end = tf.nn.dropout(layer_begin, 0.3)
    return tf.matmul(layer_begin, _weights['out']) + _biases['out']
# Construct model
pred = multilayer_perceptron(x, weights, biases)
# Define loss and optimizer
rmse = tf.reduce_sum(tf.abs(y-pred))/tf.reduce_sum(tf.abs(y)) # rmse loss
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(rmse) # Adam Optimizer
# Initializing the variables
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    # training
    for step in xrange(training_epochs):
        # Generate a minibatch.
        start = random.randrange(1, n - batch_size)
        #print start
        batch_xs = X_train[start:start+batch_size, :]
        batch_ys = Y_train[start:start+batch_size]
        _, rmseRes = sess.run([optimizer, rmse], feed_dict={x: batch_xs, y: batch_ys})
        if step % 20 == 0:
            print "rmse [%s] = %s" % (step, rmseRes)
    # testing
    pred_test = multilayer_perceptron(X_test, weights, biases)
    print "prediction", pred_test.eval()[:20]
    print "actual = ", Y_test[:20]
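One thing worth checking in the code above (a common pitfall, though not confirmed as the asker's root cause): the placeholder `y` has shape `[None]` while `pred` has shape `[None, 1]`, so `y - pred` in the loss broadcasts to a `[batch, batch]` matrix instead of a vector of per-example residuals. A minimal NumPy sketch of the effect:

```python
import numpy as np

batch = 4
y = np.arange(batch, dtype=np.float32)              # shape (4,), like the [None] placeholder
pred = np.arange(batch, dtype=np.float32)[:, None]  # shape (4, 1), like the MLP output

diff = y - pred
print(diff.shape)  # (4, 4): broadcasting produced a batch x batch matrix,
                   # not the (4,) vector of per-example residuals

# Flattening the prediction (or reshaping y) restores element-wise residuals:
diff_fixed = y - pred.ravel()
print(diff_fixed.shape)  # (4,)
```

In the TensorFlow graph the equivalent fix would be to reshape one of the two tensors so their shapes match before subtracting.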
P.S.: I generated the data randomly only to reproduce the error. My real data is in fact sparse and quite similar to the randomly generated data. The problem I want to solve is that the MLP gives the same prediction for all observations in the test data. That is a sign that training failed.

Comment: I've seen this in GoogLeNet ImageNet training, where with a bad hyper-parameter choice it labeled everything as "nematode" at the beginning. Things to check: does your training loss decrease? If it doesn't decrease, try a different learning rate or architecture. If it decreases to zero, maybe your loss is wrong.

Comment: Maybe this is my own ignorance of the problem, but is it reasonable to expect an MLP to converge when training on completely random data? And even if it does converge, is it reasonable to expect the learned parameters to yield better-than-chance accuracy on a separate, randomly generated test set? Thanks.

Reply: Actually, what you say makes sense. However, I generate the data randomly here only to reproduce the error. It is not my real data, but it is close to it: sparse. So the question is not whether the network converges; the question is the "same prediction" for all observations in the test data. Thanks.

Reply: Yes, I tried several hyper-parameters (learning rate / number of layers / number of neurons per layer). Sometimes I get different predictions ("small" networks); most of the time I get either identical predictions or predictions far from reality. The loss decreases slowly (and sometimes increases; it is not stable at all). I will try the other loss functions you recommended.
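On the "same prediction for all observations" symptom itself: with ReLU activations, if a layer's weights and biases drift so that the pre-activations are negative for every input (easy with a learning rate of 0.1 and `random_normal` initial weights), that layer outputs zeros for all inputs, and every downstream prediction collapses to the same constant, the output bias. A toy NumPy illustration of that failure mode, with hand-picked weights (not a diagnosis of this exact network):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
X = rng.random((5, 3))              # 5 distinct inputs with entries in [0, 1)

W = np.array([[0.5, -0.3],
              [1.0,  0.2],
              [-0.7, 0.9]])
b = np.array([-5.0, -5.0])          # biases pushed far negative: "dead" ReLU units

hidden = relu(X @ W + b)            # every pre-activation is negative here
out = hidden @ np.array([[2.0], [-1.0]]) + 0.7

print(hidden.max())                 # 0.0 -- the layer outputs zeros for all inputs
print(out.ravel())                  # [0.7 0.7 0.7 0.7 0.7] -- same prediction everywhere
```

This is consistent with the observation in the question that removing the activation functions makes the predictions differ again: without the ReLU, a dead layer can no longer clamp everything to zero.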