Python: passing a symbolic theano.tensor to a compiled theano.function
I'm trying to refactor my code so that it's easier to change the architecture. Currently, I'm building a recurrent neural network as follows:
# input (where first dimension is time)
x = T.matrix()
# target (where first dimension is time)
t = T.matrix()
# recurrent weights as a shared variable
W_hh = theano.shared(numpy.random.uniform(size=(n, n), low=-.01, high=.01))
# input to hidden layer weights
W_hx = theano.shared(numpy.random.uniform(size=(n, nin), low=-.01, high=.01))
# hidden to output layer weights
W_yh = theano.shared(numpy.random.uniform(size=(nout, n), low=-.01, high=.01))
# hidden layer bias weights
b_h = theano.shared(numpy.zeros((n)))
# output layer bias weights
b_y = theano.shared(numpy.zeros((nout)))
# initial hidden state of the RNN
h0 = theano.shared(numpy.zeros((n)))
# recurrent function
def step(x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(W_hx, x_t) + T.dot(W_hh, h_tm1) + b_h)
    y_t = T.nnet.sigmoid(T.dot(W_yh, h_t) + b_y)
    return h_t, y_t
# loop over the recurrent function for the entire sequence
[h, y], _ = theano.scan(step,
                        sequences=x,
                        outputs_info=[h0, None])
# predict function outputs y for a given x
predict = theano.function(inputs=[x,], outputs=y)
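For intuition, the recurrence that theano.scan unrolls here can be sketched in plain numpy. This is a hypothetical, self-contained analogue (small illustrative dimensions, not the author's Theano graph), showing exactly what predict computes step by step:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical small dimensions, for illustration only.
nin, n, nout, T_steps = 3, 4, 2, 5
rng = np.random.RandomState(0)
W_hx = rng.uniform(-0.01, 0.01, size=(n, nin))   # input -> hidden
W_hh = rng.uniform(-0.01, 0.01, size=(n, n))     # hidden -> hidden
W_yh = rng.uniform(-0.01, 0.01, size=(nout, n))  # hidden -> output
b_h = np.zeros(n)
b_y = np.zeros(nout)

def forward(X, h0):
    """Unrolled equivalent of theano.scan over `step`."""
    h = h0
    Hs, Ys = [], []
    for x_t in X:  # first dimension is time
        h = sigmoid(W_hx @ x_t + W_hh @ h + b_h)
        y = sigmoid(W_yh @ h + b_y)
        Hs.append(h)
        Ys.append(y)
    return np.array(Hs), np.array(Ys)

X = rng.uniform(size=(T_steps, nin))
Hs, Ys = forward(X, np.zeros(n))
print(Hs.shape, Ys.shape)  # (5, 4) (5, 2)
```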
This works fine. But the problem with this implementation is that I have to hard-code the weights and make sure all the math is correct every time I change the architecture. Inspired by this, I tried to refactor the code by introducing a Layer class:
class Layer:
    def __init__(self, inputs=[], nins=[], nout=None, Ws=[], b=None, activation=T.tanh):
        """
        inputs: an array of theano symbolic vectors
        activation: the activation function for the hidden layer
        nins, nout, Ws, b: either pass the dimensions of the inputs and outputs, or pass
        the shared theano tensors for the weights and bias.
        """
        n = len(inputs)
        assert(n != 0)
        self.inputs = inputs
        self.activation = activation
        # create the shared weights if necessary
        if len(Ws) == 0:
            assert(len(nins) == n)
            assert(nout is not None)
            for i in range(n):
                nin = nins[i]
                W = theano.shared(
                    numpy.random.uniform(
                        size=(nout, nin),
                        low=-numpy.sqrt(6. / (nin + nout)),
                        high=numpy.sqrt(6. / (nin + nout))
                    ),
                )
                Ws.append(W)
        # create the shared biases if necessary
        if b is None:
            assert(nout is not None)
            b = theano.shared(numpy.zeros((nout,)))
        self.Ws = Ws
        self.b = b
        self.params = self.Ws + [b]
        self.weights = Ws
        linear = self.b
        for i in range(n):
            linear += T.dot(self.Ws[i], self.inputs[i])
        if self.activation:
            self.output = self.activation(linear)
        else:
            self.output = linear
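The Layer class accumulates T.dot(W_i, input_i) over its inputs, which is mathematically the same as multiplying one stacked weight matrix by the concatenated inputs — this is why it reproduces the hard-coded hidden layer above. A small numpy check of that equivalence (all names hypothetical):

```python
import numpy as np

rng = np.random.RandomState(1)
# Hypothetical sizes: two inputs, as in the RNN hidden layer (x and h_tm1).
nin, n = 3, 4
x = rng.uniform(size=nin)
h_tm1 = rng.uniform(size=n)
W1 = rng.uniform(size=(n, nin))  # weights for x
W2 = rng.uniform(size=(n, n))    # weights for h_tm1
b = np.zeros(n)

# What the Layer class computes: a sum of per-input dot products.
linear_sum = b + W1 @ x + W2 @ h_tm1

# Equivalent single-matrix form: stack the weights, concatenate the inputs.
W = np.hstack([W1, W2])
linear_cat = b + W @ np.concatenate([x, h_tm1])

print(np.allclose(linear_sum, linear_cat))  # True
```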
This Layer class lets me write the RNN code more cleanly, with less room for error, and makes it much easier to change the architecture:
# one step of the input
x = T.vector()
# the previous hidden layer
h_tm1 = T.vector()
# the input and the hidden layer go into the input layer
hiddenLayer = Layer(inputs=[x, h_tm1],
                    nins=[nin, n],
                    nout=n,
                    activation=T.nnet.sigmoid)
# the hidden layer vector
h = hiddenLayer.output
# the hidden layer output goes to the output
outputLayer = Layer(inputs=[h],
                    nins=[n],
                    nout=nout,
                    activation=T.nnet.sigmoid)
# the output layer vector
y = outputLayer.output
# recurrent function
step = theano.function(inputs=[x, h_tm1],
                       outputs=[h, y])
# next we need to scan over all steps for a given array of observations
# input (where first dimension is time)
Xs = T.matrix()
# initial hidden state of the RNN
h0 = theano.shared(numpy.zeros((n)))
# loop over the recurrent function for the entire sequence
[Hs, Ys], _ = theano.scan(step,
                          sequences=Xs,
                          outputs_info=[h0, None])
# predict function outputs y for a given x
predict = theano.function(inputs=[Xs,], outputs=Ys)
However, when I run the program, I get an error:
TypeError: ('Bad input argument to theano function at index 0(0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')
The problem here is that the scan operation passes symbolic variables (subtensors of Xs) to the compiled step function. The whole point of refactoring my code was so that I wouldn't have to define all of the computation inside the step function. Now I'm left with four symbolic variables (x, h_tm1, h, y) that define the segment of the computation graph I need to scan over Xs. However, I don't know how to do that, since theano.function cannot accept symbolic variables as inputs.
Here is a simpler example of what I'm working with. Do you know how to get around this error?

You basically cannot use a compiled Theano function as a scan op. The workaround is to have the Layer class provide a function that builds the computation tree, and then compile the scan operation over that tree. The solution is to use theano.clone with the replace keyword argument. For example, in an exponentiation example, the step function can be defined as follows:
def step(p, a):
    replaces = {prior_result: p, A: a}
    n = theano.clone(next_result, replace=replaces)
    return n
theano.clone is exactly what I was looking for.
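The key idea of the workaround, restated without Theano: keep step as an ordinary function that builds its result from the arguments it receives, rather than a compiled black box, so the scan machinery can re-apply it to fresh (symbolic) inputs at each time step. A minimal numpy sketch of that builder pattern, where the plain Python loop plays the role of theano.scan and all names and dimensions are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_step(W_hx, W_hh, W_yh, b_h, b_y):
    # Returns an *uncompiled* step function closed over the weights.
    # The caller can re-apply it to any inputs, the way theano.scan
    # re-applies a graph-building step to each subtensor of Xs.
    def step(x_t, h_tm1):
        h_t = sigmoid(W_hx @ x_t + W_hh @ h_tm1 + b_h)
        y_t = sigmoid(W_yh @ h_t + b_y)
        return h_t, y_t
    return step

rng = np.random.RandomState(2)
nin, n, nout = 3, 4, 2
step = make_step(rng.randn(n, nin), rng.randn(n, n),
                 rng.randn(nout, n), np.zeros(n), np.zeros(nout))

# The loop below stands in for theano.scan over the sequence.
h = np.zeros(n)
ys = []
for x_t in rng.randn(6, nin):  # 6 time steps
    h, y = step(x_t, h)
    ys.append(y)
print(len(ys), ys[0].shape)  # 6 (2,)
```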