MLP on TensorFlow (Python) gives the same prediction for all observations after training


I am trying to train an MLP on sparse data to make predictions. However, the predictions on the test data come out the same for every observation. As soon as I omit the activation function from each layer, the results change. My code is below:

# imports
import numpy as np
import tensorflow as tf
import random
import json
from scipy.sparse import rand


# Parameters
learning_rate= 0.1 
training_epochs = 50
batch_size = 100

# Network Parameters
m= 1000 #number of features
n= 5000 # number of observations

hidden_layers = [5,2,4,1,6]
n_layers = len(hidden_layers)
n_input =  m 
n_classes = 1 # it's a regression problem

X_train = rand(n, m, density=0.2,format = 'csr').todense().astype(np.float32)
Y_train =  np.random.randint(4, size=n)


X_test = rand(200, m, density=0.2,format = 'csr').todense().astype(np.float32)
Y_test =  np.random.randint(4, size=200)

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None])


# Store layers weight & bias
weights = {}
biases = {}
weights['h1'] = tf.Variable(tf.random_normal([n_input, hidden_layers[0]])) # first matrix
biases['b1'] = tf.Variable(tf.random_normal([hidden_layers[0]]))

for i in xrange(2,n_layers+1):
    weights['h'+str(i)]=   tf.Variable(tf.random_normal([hidden_layers[i-2], hidden_layers[i-1]]))
    biases['b'+str(i)] = tf.Variable(tf.random_normal([hidden_layers[i-1]]))

weights['out'] = tf.Variable(tf.random_normal([hidden_layers[-1], 1]))   # matrix between last layer and output
biases['out']= tf.Variable(tf.random_normal([1]))


# Create model
def multilayer_perceptron(_X, _weights, _biases):
    layer_begin = tf.nn.relu(tf.add(tf.matmul(_X, _weights['h1'],a_is_sparse=True), _biases['b1']))

    for layer in xrange(2,n_layers+1):
        layer_begin = tf.nn.relu(tf.add(tf.matmul(layer_begin, _weights['h'+str(layer)]), _biases['b'+str(layer)]))
        #layer_end = tf.nn.dropout(layer_begin, 0.3)

    return tf.matmul(layer_begin, _weights['out'])+ _biases['out']


# Construct model
pred = multilayer_perceptron(x, weights, biases)



# Define loss and optimizer
rmse = tf.reduce_sum(tf.abs(y-pred))/tf.reduce_sum(tf.abs(y)) # rmse loss
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(rmse) # Adam Optimizer

# Initializing the variables
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)

    #training
    for step in xrange(training_epochs):

        # Generate a minibatch.
        start = random.randrange(1, n - batch_size)
        #print start
        batch_xs=X_train[start:start+batch_size,:]
        batch_ys =Y_train[start:start+batch_size]

        #printing
        _,rmseRes = sess.run([optimizer, rmse] , feed_dict={x: batch_xs, y: batch_ys} )
        if step % 20 == 0:
             print "rmse [%s] = %s" % (step, rmseRes)


    #testing
    pred_test = multilayer_perceptron(X_test, weights, biases)
    print "prediction", pred_test.eval()[:20] 
    print  "actual = ", Y_test[:20]

PS: I generated the data randomly just to reproduce the error. My real data is sparse and quite similar to the randomly generated data. The problem I want to solve is that the MLP gives the same prediction for every observation in the test data.
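
One cheap thing to try here (an assumption based on the code above, not something confirmed in the discussion): with 1000 input features, tf.random_normal at its default standard deviation of 1.0, combined with a learning rate of 0.1, can push the ReLU units into a regime where they all output zero, which yields identical predictions. A minimal sketch of the same weight setup with a smaller initialization scale:

# Hypothetical variant of the initialization above: same layer sizes, but with a
# small standard deviation so the ReLU pre-activations start close to zero.
weights['h1'] = tf.Variable(tf.random_normal([n_input, hidden_layers[0]], stddev=0.1))
biases['b1'] = tf.Variable(tf.zeros([hidden_layers[0]]))

for i in xrange(2, n_layers + 1):
    weights['h' + str(i)] = tf.Variable(
        tf.random_normal([hidden_layers[i-2], hidden_layers[i-1]], stddev=0.1))
    biases['b' + str(i)] = tf.Variable(tf.zeros([hidden_layers[i-1]]))

weights['out'] = tf.Variable(tf.random_normal([hidden_layers[-1], 1], stddev=0.1))
biases['out'] = tf.Variable(tf.zeros([1]))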

That is a sign that training has failed. In GoogLeNet ImageNet training I have seen it label everything as "nematode" at the start with a bad choice of hyperparameters. Things to check: is your training loss decreasing? If it is not decreasing, try a different learning rate or architecture. If it decreases to zero, maybe your loss is wrong, as was the case here.
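
One concrete thing worth double-checking in the posted loss (my reading of the shapes in the code, not something stated above): y is a placeholder of shape [None] while pred has shape [None, 1], so y - pred broadcasts to a [batch, batch] matrix, and the "rmse" value is computed over that matrix rather than over per-example errors. A minimal sketch of a shape-matched mean-squared-error loss with a smaller Adam learning rate (both choices are assumptions):

# Hypothetical replacement for the loss/optimizer definition: reshape the targets
# to [batch, 1] so the subtraction is element-wise, then use a plain MSE.
y_col = tf.reshape(y, [-1, 1])                   # [batch] -> [batch, 1]
mse = tf.reduce_mean(tf.square(y_col - pred))    # mean squared error per example
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(mse)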

Maybe this is just my own ignorance of the problem, but is it reasonable to expect an MLP to converge when it is trained on completely random data? And even if it did, is it reasonable to expect the learned parameters to do better than chance on a separate, randomly generated test set? Thanks.

Actually, what you say makes sense. But I generated the data randomly here just to reproduce the error. It is not my real data, though it is close to it and sparse. So the question is not whether the neural network converges; the question is the "same prediction" for every observation in the test data.

Thanks. Yes, I tried several hyperparameters (learning rate / number of layers / number of neurons per layer). Sometimes I get different predictions ("small" networks); most of the time I get either identical predictions or predictions far from reality. The loss decreases slowly (sometimes it increases, it is not stable at all). I will try the other loss functions you recommended.
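
To see whether the predictions already collapse during training rather than only at test time, one option (hypothetical, not taken from the discussion above) is to fetch the batch predictions alongside the loss and print their spread; a standard deviation near zero means the network is producing the same output for every observation in the batch:

# Variant of the training-loop body (goes inside the "for step" loop): also fetch
# the predictions and report their spread across the current batch.
_, rmseRes, predRes = sess.run([optimizer, rmse, pred],
                               feed_dict={x: batch_xs, y: batch_ys})
if step % 20 == 0:
    print "rmse [%s] = %s, prediction std = %s" % (step, rmseRes, np.std(predRes))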