Python: accuracy of a two-layer neural network does not improve
I'm new to TensorFlow and am building my first two-layer neural network. I'm using the UCI heart disease dataset.
import tensorflow as tf
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
RANDOM_SEED = 41
tf.set_random_seed(RANDOM_SEED)

def init_weights(shape):
    """ Weight initialization """
    weights = tf.random_normal(shape, stddev=0.1)
    return tf.Variable(weights)

def forwardprop(X, w_1, w_2, w_3):
    h_1 = tf.nn.sigmoid(tf.matmul(X, w_1))
    h_2 = tf.nn.sigmoid(tf.matmul(h_1, w_2))
    yhat = tf.nn.sigmoid(tf.matmul(h_2, w_3))
    return yhat

def get_heart_data():
    disease = pd.read_csv('../data/disease.csv')
    disease.replace(to_replace="?", value = "u", inplace = True)
    disease = pd.get_dummies(disease, columns=['ca', 'thal', 'fbs', 'exang', 'slop', 'sex', 'cp'], drop_first=True)
    all_X = disease.drop(['pred_attribute'],1)
    all_y = disease['pred_attribute']
    all_y = pd.get_dummies(all_y, columns=['pred_attribute'], drop_first=False)
    return train_test_split(all_X, all_y, test_size=0.3, random_state=RANDOM_SEED)

def main():
    train_X, test_X, train_y, test_y = get_heart_data()

    # Layer's sizes
    x_size = 21
    h_1_size = 154
    h_2_size = 79
    y_size = 5

    # Symbols
    X = tf.placeholder("float", shape=[None, x_size])
    y = tf.placeholder("float", shape=[None, y_size])

    # Weight initializations
    w_1 = init_weights((x_size, h_1_size))
    w_2 = init_weights((h_1_size, h_2_size))
    w_3 = init_weights((h_2_size, y_size))

    # Forward propagation
    logits = forwardprop(X, w_1, w_2, w_3)

    # Backward propagation
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
    updates = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

    # Run SGD
    sess = tf.Session()
    init = tf.global_variables_initializer()
    sess.run(init)

    for epoch in range(100):
        # Train with each example
        for i in range(len(train_X)):
            sess.run(updates, feed_dict={X: train_X, y: train_y })

        pred = tf.nn.softmax(logits) # Apply softmax to logits
        correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
        training_accuracy = sess.run(accuracy, feed_dict={X: train_X, y: train_y})
        testing_accuracy = sess.run(accuracy, feed_dict={X: test_X, y: test_y})

        print("Epoch = %d, train accuracy = %.2f%%, test accuracy = %.2f%%"
              % (epoch + 1, 100 * training_accuracy, 100. * testing_accuracy))

    sess.close()

main()

I think I have set everything up correctly, but when I run the program it just reports the same accuracy over and over:
Epoch = 1, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 2, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 3, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 4, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 5, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 6, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 7, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 8, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 9, train accuracy = 55.19%, test accuracy = 51.65%
Epoch = 10, train accuracy = 55.19%, test accuracy = 51.65%
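This continues all the way to epoch 100. I tried multiplying the accuracy by 100000 to see whether it changes by a tiny amount, but it stays the same every time. I don't know whether the problem is my network, my accuracy function, or something else. Any help is much appreciated,
-Matt

Your test and training sets stay the same at every epoch, so why would you expect different results? I think what you want is to start at epoch 1 with only a few samples and then add more samples with each epoch: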
n_epoch = 100
# Assuming of course that you have more than n_epoch samples in each of your sets
trainSamplesByEpoch = int(len(train_X) / n_epoch)
for epoch in range(1, n_epoch + 1):
    train_X_current = train_X[0:epoch*trainSamplesByEpoch]
    train_y_current = train_y[0:epoch*trainSamplesByEpoch]
    # Train your network with train_X_current, train_y_current
    # Compute the train accuracy with train_X_current, train_y_current
    # Compute the test accuracy with test_X, test_y
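
The subset schedule in the snippet above could be factored into a small helper. The following is only a sketch: `prefix_sizes` is a hypothetical name, and the sample count of 212 is a made-up figure, not the size of the asker's training set. Unlike the integer division above, it guarantees that the last step uses every sample.

```python
def prefix_sizes(n_samples, n_steps):
    """Return n_steps non-decreasing subset sizes, the last one equal to n_samples."""
    if n_steps <= 0 or n_samples < n_steps:
        raise ValueError("need at least one sample per step")
    # Integer arithmetic keeps the sizes exact and ends at n_samples.
    return [(step * n_samples) // n_steps for step in range(1, n_steps + 1)]


# Hypothetical training set: 212 samples, represented here by their indices.
samples = list(range(212))
sizes = prefix_sizes(len(samples), 100)

for size in sizes:
    current = samples[:size]  # the growing training subset for this step
    # Train and evaluate on `current` here.
```

With pandas objects the same sizes would be used as `train_X[:size]` and `train_y[:size]`.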
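Using epochs this way tells you roughly how many samples are enough to reach the desired performance. Using all the samples may cause the model to overfit.

One possible problem is the way you build the network. You apply a non-linearity everywhere, even at the output layer. The loss you are using, tf.nn.softmax_cross_entropy_with_logits(), expects the logits to be a linear function of the last hidden layer, because it applies softmax to them internally. So here is how to build the network: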
def forwardprop(X, w_1, w_2, w_3):
    h_1 = tf.nn.sigmoid(tf.matmul(X, w_1))
    h_2 = tf.nn.sigmoid(tf.matmul(h_1, w_2))
    yhat = tf.matmul(h_2, w_3)
    return yhat
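
To make the point about logits concrete, here is a minimal sketch in plain Python (no TensorFlow) that compares softmax cross-entropy on raw logits with the same loss on sigmoid-squashed values. The five-element logit vector and the helper names are illustrative assumptions, not taken from the question's code.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def cross_entropy(probs, true_index):
    """Cross-entropy for a one-hot label at true_index."""
    return -math.log(probs[true_index])


# Hypothetical output of the last matmul for one sample whose true class is 0.
raw_logits = [8.0, -3.0, 0.5, -6.0, 1.0]
squashed = [sigmoid(v) for v in raw_logits]  # what the question's network feeds to the loss

loss_raw = cross_entropy(softmax(raw_logits), 0)
loss_squashed = cross_entropy(softmax(squashed), 0)

# Best case after a sigmoid: the true class is 1 and the other four are 0,
# so the true-class probability is at most e / (e + 4), a loss of about 0.905.
loss_floor = -math.log(math.e / (math.e + 4.0))

print("loss on raw logits:      %.4f" % loss_raw)
print("loss on squashed logits: %.4f" % loss_squashed)
print("lowest possible loss after sigmoid: %.4f" % loss_floor)
```

Because every sigmoid output lies in (0, 1), softmax over five such values can assign at most about 0.40 to any class, so the loss cannot approach zero however confident the network is. This may be part of why training appears stuck, although other factors such as feature scaling could also contribute.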
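The way you structure your code will also cause you problems when you move on to more complex networks. You shouldn't think in terms of forwardprop() and backprop(). When writing TensorFlow code, think of it as specifying a graph together with the computations that need to run on it. Have a look at the gold standard for how to structure your code.

When you say "train with train_X_current and test_X_current", do you mean in the accuracy computation or in the network training? Also, you say my training set stays the same at every epoch and so the accuracy should stay the same, but my impression was that the accuracy changes as the weights are adjusted through backpropagation. I want to compute the accuracy at every epoch.

@MatthewCasey I corrected and completed my code and added where to train and where to compute the accuracy. Backpropagation is a single-run algorithm; running it several times from the same initial conditions produces the same result.

I don't know whether this will solve the problem, but the way you build the network is an issue. Remove the sigmoid non-linearity from the last layer. Softmax needs logits to work properly; if you push them through a sigmoid, it will not work properly.

def forwardprop(X, w_1, w_2, w_3):
    h_1 = tf.nn.sigmoid(tf.matmul(X, w_1))
    h_2 = tf.nn.sigmoid(tf.matmul(h_1, w_2))
    yhat = tf.matmul(h_2, w_3)
    return yhat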
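
One way to check the asker's expectation that accuracy changes as the weights are adjusted is to run gradient descent repeatedly on a fixed training set and record the loss after each pass. The sketch below does this in plain Python under loud assumptions: the data are two synthetic 2-D clusters rather than the heart disease dataset, the model is a single linear layer followed by softmax rather than the asker's network, and all names (`make_data`, `train`) are hypothetical.

```python
import math
import random


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def make_data(n_per_class, rng):
    """Two synthetic 2-D clusters, one per class (illustrative data only)."""
    data = []
    for label, (cx, cy) in enumerate([(-1.0, -1.0), (1.0, 1.0)]):
        for _ in range(n_per_class):
            data.append(([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)], label))
    return data


def train(data, n_classes=2, n_epochs=20, lr=0.2):
    """Full-batch gradient descent on softmax cross-entropy with linear logits.

    Returns a list of (mean loss, accuracy) measured before each update.
    """
    n_features = len(data[0][0])
    n = len(data)
    W = [[0.0] * n_classes for _ in range(n_features)]
    history = []
    for _ in range(n_epochs):
        grad = [[0.0] * n_classes for _ in range(n_features)]
        loss, correct = 0.0, 0
        for x, label in data:
            # Linear logits: no activation on the output layer.
            logits = [sum(x[i] * W[i][k] for i in range(n_features))
                      for k in range(n_classes)]
            probs = softmax(logits)
            loss -= math.log(probs[label])
            predicted = max(range(n_classes), key=probs.__getitem__)
            correct += int(predicted == label)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                for i in range(n_features):
                    grad[i][k] += err * x[i]
        history.append((loss / n, correct / n))
        for i in range(n_features):
            for k in range(n_classes):
                W[i][k] -= lr * grad[i][k] / n
    return history


history = train(make_data(20, random.Random(0)))
for epoch, (loss, acc) in enumerate(history, start=1):
    print("Epoch = %d, loss = %.4f, train accuracy = %.2f%%" % (epoch, loss, 100 * acc))
```

In this toy setting the training data are identical at every epoch, yet the loss falls from epoch to epoch because each update starts from the weights left by the previous one.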