How to perform conditional updates on shared variables in Theano?


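Is there a way to conditionally update a shared variable depending on the result of the current function? For example, in this model I only need to update the weight parameters when the cost is greater than 0. theano.function has a no_default_updates parameter, but it does not apply to the "updates" parameter.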

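You can use a symbolic conditional operation. Theano has two: switch and ifelse. switch operates element-wise, while ifelse works more like a conventional conditional statement. For more information, see the Theano documentation.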
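To make the difference concrete, here is a plain-Python analogy rather than Theano code. The helpers switch_like and ifelse_like are hypothetical names made up for this sketch. It assumes the documented behaviour that switch picks element by element from two branches that are both computed, whereas ifelse takes a single scalar condition and only evaluates the branch it needs.

```python
# Plain-Python analogy of the two conditionals (not Theano code).
# switch_like, ifelse_like and branch are hypothetical helper names.

calls = []  # records which branches were actually computed


def branch(name, values):
    """Pretend to compute a branch and record that it was evaluated."""
    calls.append(name)
    return values


def switch_like(cond, if_true, if_false):
    """Element-wise select: both branches arrive already computed."""
    return [t if c else f for c, t, f in zip(cond, if_true, if_false)]


def ifelse_like(cond, true_thunk, false_thunk):
    """Scalar condition: only the chosen branch is evaluated."""
    return true_thunk() if cond else false_thunk()


# One condition per element; both branches are computed before selecting.
picked = switch_like([True, False, True],
                     branch("then", [1, 2, 3]),
                     branch("else", [10, 20, 30]))
switch_calls = list(calls)

calls.clear()

# A single condition for the whole value; the unused branch never runs.
chosen = ifelse_like(True,
                     lambda: branch("then", [1, 2, 3]),
                     lambda: branch("else", [10, 20, 30]))
ifelse_calls = list(calls)
```

For a scalar test such as "cost > 0", both styles return the same value; the difference is only in how much work gets done along the way.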

Here is an example that updates the parameters only when the cost is positive:

import numpy
import theano
import theano.tensor as tt


def compile(input_size, hidden_size, output_size, learning_rate):
    w_h = theano.shared(numpy.random.standard_normal((input_size, hidden_size))
                        .astype(theano.config.floatX), name='w_h')
    b_h = theano.shared(numpy.random.standard_normal((hidden_size,))
                        .astype(theano.config.floatX), name='b_h')
    w_y = theano.shared(numpy.random.standard_normal((hidden_size, output_size))
                        .astype(theano.config.floatX), name='w_y')
    b_y = theano.shared(numpy.random.standard_normal((output_size,))
                        .astype(theano.config.floatX), name='b_y')
    x = tt.matrix()
    z = tt.vector()
    h = tt.tanh(theano.dot(x, w_h) + b_h)
    y = theano.dot(h, w_y) + b_y
    c = tt.sum(y - z)
    updates = [(p, p - tt.switch(tt.gt(c, 0), learning_rate * tt.grad(cost=c, wrt=p), 0))
               for p in (w_h, b_h, w_y, b_y)]
    return theano.function([x, z], outputs=c, updates=updates)


def main():
    f = compile(input_size=3, hidden_size=2, output_size=4, learning_rate=0.01)


main()
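The update rule above can be checked by hand on a toy problem. The following is a minimal sketch in plain Python, not Theano: it assumes a single scalar weight and a made-up cost c = w * x - z, and the function name conditional_step is hypothetical. It mirrors the expression p - switch(c > 0, learning_rate * grad, 0) from the example.

```python
# Toy, Theano-free illustration of "step only when the cost is positive".
# conditional_step and the scalar cost below are assumptions for this sketch.


def conditional_step(w, x, z, learning_rate):
    """Return (cost, new_w) for the toy cost c = w * x - z."""
    cost = w * x - z
    grad = x  # derivative of the toy cost with respect to w
    # Same shape as the switch expression: a gradient step or a zero step.
    step = learning_rate * grad if cost > 0 else 0.0
    return cost, w - step


# Positive cost (5.0): the weight moves from 2.0 to 0.5.
cost_pos, w_after_pos = conditional_step(w=2.0, x=3.0, z=1.0, learning_rate=0.5)

# Negative cost (-4.0): the weight is left at 2.0.
cost_neg, w_after_neg = conditional_step(w=2.0, x=3.0, z=10.0, learning_rate=0.5)
```

Note that the cost is still returned in both cases, just as the compiled Theano function returns c whether or not the update is applied.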

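Either switch or ifelse could be used here, but switch is usually preferable in this case, because ifelse does not seem to be as well supported throughout the Theano framework and requires a separate import (from theano.ifelse import ifelse).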

Thanks for your answer :). Really helpful.
