TensorFlow: convolutional neural network with an LSTM in place of the fully connected layer
I am trying to build a neural network for baseball that detects where the ball and the strike zone are as the ball crosses the plate, but my network appears to be stuck in a local minimum: it returns the same values for every item in the dataset.

My approach is to capture 34 frames as the ball approaches the plate and use those frames to detect the moment the ball crosses it. The outputs are ball left, ball top, ball width, strike zone left, strike zone top, strike zone width, strike zone height, and the frame in which the ball crosses the plate.

My model runs convolutions over each frame, but instead of a fully connected layer at the end it uses an LSTM, so the network can infer things from previous frames. I need that inference because the ball is sometimes not visible, either because the pitcher is in front of it or because it is already in the catcher's glove.

Judging from the cost over time, the network seems stuck in a local minimum and produces the same result for every pitch in the training set.

Here is my code:
filter_size1 = 5 # Convolution filters are 5 x 5 pixels.
num_filters1 = 16 # There are 16 of these filters.
filter_size2 = 5 # Convolution filters are 5 x 5 pixels.
num_filters2 = 36 # There are 36 of these filters.
filter_size3 = 5 # Convolution filters are 5 x 5 pixels.
num_filters3 = 36 # There are 36 of these filters.
num_hidden = 256
lstm_layers = 2
num_channels = 1
num_classes = 10
width = 320
height = 180
sequence_length = Directories.Pitch_Sequence_Length
x = tf.placeholder(tf.float32, shape=[None, sequence_length, width, height], name='x')
keep_prob = tf.placeholder(tf.float32, name="keep_prob")
x_image = tf.reshape(x, [-1, width, height, num_channels])  # fold time into the batch so every frame runs through the conv stack
y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')
layer_conv1, weights_conv1, biases_conv1 = ConvolutionNeuralNetwork.new_conv_layer(
    input=x_image, num_input_channels=num_channels,
    filter_size=filter_size1, num_filters=num_filters1, use_pooling=True)
layer_conv2, weights_conv2, biases_conv2 = ConvolutionNeuralNetwork.new_conv_layer(
    input=layer_conv1, num_input_channels=num_filters1,
    filter_size=filter_size2, num_filters=num_filters2, use_pooling=True)
layer_conv3, weights_conv3, biases_conv3 = ConvolutionNeuralNetwork.new_conv_layer(
    input=layer_conv2, num_input_channels=num_filters2,
    filter_size=filter_size3, num_filters=num_filters3, use_pooling=True)
layer_flat, num_features = ConvolutionNeuralNetwork.flatten_layer(layer_conv3)
fc_sequence = tf.reshape(layer_flat, [-1, sequence_length, int(layer_flat.shape[1])])  # restore [batch, time, features] for the LSTM
# Build a fresh cell per layer: reusing one cell object via [cell] * lstm_layers makes
# MultiRNNCell share weights across layers (and raises an error in newer TF 1.x releases).
def lstm_cell():
    return tf.contrib.rnn.DropoutWrapper(tf.contrib.rnn.BasicLSTMCell(num_hidden), output_keep_prob=keep_prob)
cell = tf.contrib.rnn.MultiRNNCell([lstm_cell() for _ in range(lstm_layers)])
outputs, states = tf.contrib.rnn.static_rnn(cell, tf.unstack(tf.transpose(fc_sequence, perm=[1, 0, 2])), dtype=tf.float32)
self.y_pred = ConvolutionNeuralNetwork.new_fc_layer(outputs[-1], num_hidden, num_classes, use_relu=True)
self.cost = tf.reduce_mean(tf.pow(self.y_pred - y_true, 2))
self.optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(self.cost)
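As a sanity check on the wiring above, here is a short sketch (plain NumPy-free Python, not part of the original code) of the shapes that should flow through the graph, assuming the three 'SAME' max-pool layers each halve the spatial dimensions with ceiling rounding and that sequence_length is the 34 frames described in the question:

```python
import math

width, height = 320, 180
num_filters3 = 36
sequence_length = 34  # frames per pitch, as described above

# Each of the three conv layers uses 'SAME' max-pooling with stride 2,
# so every spatial dimension is halved with ceiling rounding.
w, h = width, height
for _ in range(3):
    w, h = math.ceil(w / 2), math.ceil(h / 2)

num_features = w * h * num_filters3  # features per frame after flatten_layer
print(w, h, num_features)            # 40 23 33120

# The frames of one pitch are folded into the batch for the convolutions,
# then reshaped back to [batch, time, features] for the LSTM:
batch = 4
conv_input_rows = batch * sequence_length          # rows seen by conv2d
lstm_input_shape = (batch, sequence_length, num_features)
print(conv_input_rows, lstm_input_shape)           # 136 (4, 34, 33120)
```

So each frame contributes a 33120-dimensional feature vector, and the LSTM sees 34 of those per pitch.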
And the ConvolutionNeuralNetwork.py class:
import tensorflow as tf
def new_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))

def new_biases(length):
    return tf.Variable(tf.constant(0.05, shape=[length]))

def new_conv_layer(input, num_input_channels, filter_size, num_filters, use_pooling=True, weights=None, biases=None):
    shape = [filter_size, filter_size, num_input_channels, num_filters]
    if weights is None:
        weights = new_weights(shape=shape)
    if biases is None:
        biases = new_biases(length=num_filters)
    layer = tf.nn.conv2d(input=input, filter=weights, strides=[1, 1, 1, 1], padding='SAME')
    layer += biases
    if use_pooling:
        layer = tf.nn.max_pool(value=layer, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    layer = tf.nn.relu(layer)
    return layer, weights, biases

def flatten_layer(layer):
    layer_shape = layer.get_shape()
    num_features = layer_shape[1:4].num_elements()
    layer_flat = tf.reshape(layer, [-1, num_features])
    return layer_flat, num_features

def flatten_layer_multiple(layer1, layer2):
    layer = tf.concat([layer1, layer2], 1)
    layer_shape = layer.get_shape()
    num_features = layer_shape[1:4].num_elements()
    layer_flat = tf.reshape(layer, [-1, num_features])
    return layer_flat, num_features

def new_fc_layer(input, num_inputs, num_outputs, use_relu=True, keep_prob=None):
    weights = new_weights(shape=[num_inputs, num_outputs])
    biases = new_biases(length=num_outputs)
    layer = tf.matmul(input, weights) + biases
    if use_relu:
        layer = tf.nn.relu(layer)
    if keep_prob is not None:
        # tf.nn.dropout returns a new tensor; it must be assigned back,
        # otherwise the dropout is silently discarded
        layer = tf.nn.dropout(layer, keep_prob)
    return layer
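One side note on new_conv_layer: it pools before applying the ReLU. Because ReLU is monotone, relu(max(x)) equals max(relu(x)), so the result is unchanged, and pooling first is slightly cheaper since the ReLU then runs on a quarter of the values. A small NumPy sketch of that equivalence:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8, 8, 3))  # a fake [batch, h, w, channels] activation

def relu(a):
    return np.maximum(a, 0.0)

def max_pool_2x2(a):
    # non-overlapping 2x2 max-pooling, stride 2
    b, h, w, c = a.shape
    return a.reshape(b, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

pool_then_relu = relu(max_pool_2x2(x))
relu_then_pool = max_pool_2x2(relu(x))
print(np.allclose(pool_then_relu, relu_then_pool))  # True
```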
Here is my learning curve (cost over time):
Here is an example of the expected result:
And here is what the network actually predicts:
The network draws the strike-zone box and the ball box in exactly the same position for every pitch. Does anyone know what I am doing wrong?

Comment: Do you really need the relu on the last fc layer?
Reply: Hmm, I can try getting rid of it. Thanks.
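That comment is worth taking seriously for a regression head like this one. If the pre-activations of the final layer drift negative during training, tf.nn.relu clamps them all to zero, the gradient through those units vanishes, and the network emits the same output for every input, which matches the symptom described above. A minimal NumPy sketch of the effect (the weights and biases here are hypothetical, chosen only to illustrate the failure mode, not taken from the trained model):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical final-layer parameters whose pre-activations have gone negative:
# small weights, biases driven strongly negative during training.
W = rng.standard_normal((256, 10)) * 0.01
b = np.full(10, -10.0)

def head(features, use_relu):
    z = features @ W + b
    return np.maximum(z, 0.0) if use_relu else z

f1 = rng.standard_normal(256)  # LSTM features for pitch 1
f2 = rng.standard_normal(256)  # LSTM features for pitch 2

# With the relu, two different inputs collapse to the same all-zero output...
print(np.array_equal(head(f1, True), head(f2, True)))    # True
# ...while a linear head still distinguishes them.
print(np.array_equal(head(f1, False), head(f2, False)))  # False
```

If this is indeed the problem, the corresponding change in the model above would be passing use_relu=False in the final new_fc_layer call, leaving the regression output linear.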