Python Theano hidden layer activation function


python machine-learning neural-network


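Is it possible to use a rectified linear unit (ReLU) as the activation function of the hidden layer, instead of tanh() or sigmoid()? The hidden layer is implemented as shown below, and as far as I could find by searching the internet, ReLU is not implemented in Theano.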

class HiddenLayer(object):
  def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
    pass
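The constructor above takes the activation as a plain callable, so swapping tanh for ReLU only means passing a different function. The sketch below illustrates that idea without Theano. It assumes the layer computes activation(input · W + b), which the question does not show, and `TinyHiddenLayer`, `relu`, and `tanh` are hypothetical names used only for this illustration.

```python
import math


def tanh(values):
    """Elementwise tanh on a list of floats."""
    return [math.tanh(v) for v in values]


def relu(values):
    """Elementwise ReLU on a list of floats."""
    return [v if v > 0 else 0.0 for v in values]


class TinyHiddenLayer(object):
    """Hypothetical pure-Python stand-in for HiddenLayer (no Theano)."""

    def __init__(self, W, b, activation=tanh):
        self.W = W                    # n_in rows, each holding n_out weights
        self.b = b                    # n_out biases
        self.activation = activation  # any callable on a list of floats

    def output(self, inp):
        # Linear part (assumed): inp . W + b, one output unit at a time.
        lin = [
            sum(inp[i] * self.W[i][j] for i in range(len(inp))) + self.b[j]
            for j in range(len(self.b))
        ]
        return self.activation(lin)


layer = TinyHiddenLayer(W=[[1.0, -1.0], [2.0, 0.5]], b=[0.0, -1.0],
                        activation=relu)
out = layer.output([1.0, 1.0])
```

In the Theano class the same pattern would apply: any function that maps a symbolic tensor to a symbolic tensor can be passed as `activation`.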

ReLU is easy to implement in Theano:

switch(x<0, 0, x)

I think it is more precise to write it this way:
x * (x > 0.) + 0. * (x < 0.)

I write it like this:
lambda x: T.maximum(0,x)

Update: the latest version of Theano supports ReLU natively (nnet.relu), and it should be preferred over custom solutions.
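Since the native ReLU is only present in newer Theano versions, code that has to run on both can fall back to a custom expression when the attribute is missing. The sketch below shows that pattern with fake module objects standing in for theano.tensor.nnet, so it runs without Theano. `pick_relu` and `fallback_relu` are hypothetical helpers, not part of Theano.

```python
from types import SimpleNamespace


def pick_relu(nnet_module, fallback):
    """Return nnet_module.relu when the attribute exists, else fallback."""
    return getattr(nnet_module, "relu", fallback)


def fallback_relu(x):
    # Scalar stand-in for a custom expression such as T.switch(x < 0, 0, x).
    return 0.0 if x < 0 else x


# Fake modules standing in for theano.tensor.nnet without and with relu.
old_nnet = SimpleNamespace()
new_nnet = SimpleNamespace(relu=lambda x: max(x, 0.0))

relu_old = pick_relu(old_nnet, fallback_relu)  # falls back to the custom one
relu_new = pick_relu(new_nnet, fallback_relu)  # uses the native attribute
```

With Theano installed, the call would presumably look like `pick_relu(theano.tensor.nnet, lambda x: T.switch(x < 0, 0, x))`.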
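I decided to compare the speed of the solutions, since it matters a lot for NNs. I compared the speed of the function itself and of its gradient: for the function itself switch is preferable, while the gradient is faster for x * (x > 0).
All computed gradients are correct.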
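import numpy
import theano
import theano.tensor as T

def relu1(x):
    return T.switch(x < 0, 0, x)

def relu2(x):
    return T.maximum(x, 0)

def relu3(x):
    return x * (x > 0)


z = numpy.random.normal(size=[1000, 1000])
for f in [relu1, relu2, relu3]:
    x = theano.tensor.matrix()
    fun = theano.function([x], f(x))
    # %timeit is an IPython magic; run this in IPython or a notebook.
    %timeit fun(z)
    # Check the result against a NumPy reference.
    assert numpy.all(fun(z) == numpy.where(z > 0, z, 0))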

Output: (time to compute ReLU function)
>100 loops, best of 3: 3.09 ms per loop
>100 loops, best of 3: 8.47 ms per loop
>100 loops, best of 3: 7.87 ms per loop

for f in [relu1, relu2, relu3]:
    x = theano.tensor.matrix()
    fun = theano.function([x], theano.grad(T.sum(f(x)), x))
    %timeit fun(z)
    assert numpy.all(fun(z) == (z > 0))

Output: time to compute gradient 
>100 loops, best of 3: 8.3 ms per loop
>100 loops, best of 3: 7.46 ms per loop
>100 loops, best of 3: 5.74 ms per loop

So Theano generates suboptimal code for the gradient. IMHO, the switch version should be preferred today.
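The `%timeit` lines in the benchmark only work inside IPython. A rough equivalent for a plain Python script can be built on the standard `timeit` module, as sketched below. `best_ms_per_loop` is a hypothetical helper, and the workload here is a pure-Python stand-in rather than a compiled Theano function, so its timings say nothing about Theano itself.

```python
import timeit


def best_ms_per_loop(func, number=100, repeat=3):
    """Best-of-`repeat` average time per call of func(), in milliseconds."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number * 1e3


def relu_switch(x):
    # Scalar analogue of T.switch(x < 0, 0, x).
    return 0.0 if x < 0 else x


# Stand-in workload: apply the scalar ReLU to 1000 values.
data = [(i - 500) / 100.0 for i in range(1000)]
ms = best_ms_per_loop(lambda: [relu_switch(v) for v in data])
print("%d loops, best of %d: %.3g ms per loop" % (100, 3, ms))
```

For the benchmark above, the call would presumably be `best_ms_per_loop(lambda: fun(z))` with `fun` and `z` defined as in that code.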
The function is very simple in plain Python (for a scalar input):
def relu(input):
    output = max(input, 0)
    return(output)

0. * (x < 0.) will be optimized away, so the formula that is actually executed is x * (x > 0).
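The claim that the extra term changes nothing can be checked on ordinary floats, and the same check shows one place where the multiplicative forms differ from the switch form. The sketch below uses plain Python floats as a stand-in for Theano tensors. It assumes Theano follows the same IEEE floating-point arithmetic, which is not verified here, and the function names are hypothetical.

```python
def relu_switch(x):
    # Scalar analogue of switch(x < 0, 0, x).
    return 0.0 if x < 0 else x


def relu_mul(x):
    # x * (x > 0.): the boolean acts as 0 or 1.
    return x * (x > 0.)


def relu_mul_full(x):
    # The longer form with the redundant second term.
    return x * (x > 0.) + 0. * (x < 0.)


samples = [-3.5, -1.0, 0.0, 0.25, 7.0]
# All three forms agree on finite inputs.
agree = all(relu_switch(v) == relu_mul(v) == relu_mul_full(v) for v in samples)

# At negative infinity the product -inf * 0 is NaN, while switch returns 0.
neg_inf = float("-inf")
at_neg_inf = (relu_switch(neg_inf), relu_mul(neg_inf))
```

So the forms are interchangeable for finite activations, but only the switch form stays finite if an activation overflows to negative infinity.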
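How is the non-differentiability at zero handled here?
@nouiz I just installed Theano on my laptop, and the library does not include nnet.relu. However, I can use nnet.relu on my desktop, where I installed Theano a few days ago. What could be the reason?
@Amir, it is because they do not have the same Theano version. The one without relu uses the last released Theano version, 0.7; the one with relu uses the development version (it is stable, and we recommend that people use it).
Where does this come from?
Note that when you care about GPU speed, T.max is the fastest. See also [link].
@Albert, no, I decided to compare the versions I found here (unfortunately I do not have a GPU, so these are CPU results). Thanks for the first link!
There is some follow-up discussion about speed [link].
It seems that relu does not exist in theano.tensor.nnet, although it should be present in 0.7. pip3 show theano gives --- Name: Theano Version: 0.7.0 Location: /Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages and In [5]: theano.tensor.nnet.relu --> 1 theano.tensor.nnet.relu AttributeError: 'module' object has no attribute 'relu'. Has anyone run into this?
I just got some clues. If you install Theano with pip (even from scratch and with the -I option), the 0.7 stable version is installed. But right now (as I am writing this comment), if you install the latest version with pip3 install --upgrade --no-deps git+git://github.com/Theano/Theano.git then relu is present in theano.tensor.nnet.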
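On the question about non-differentiability at zero: ReLU has slope 0 to the left of zero and slope 1 to the right, so any implementation has to pick some value at exactly x = 0. The sketch below only illustrates the two one-sided slopes numerically in plain Python; it does not show which value Theano picks. `one_sided_slopes` is a hypothetical helper.

```python
def relu(x):
    return max(x, 0.0)


def one_sided_slopes(f, x, h=1e-6):
    """Finite-difference slopes of f just left and just right of x."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right


left, right = one_sided_slopes(relu, 0.0)
print(left, right)
```

With inputs drawn from a continuous distribution, as in the benchmark with numpy.random.normal, an exact zero is extremely unlikely, which is consistent with the gradient check against (z > 0) passing for all three versions.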
For comparison, the time to compute x > 0 on its own, which is all the gradient needs:

x = theano.tensor.matrix()
fun = theano.function([x], x > 0)
%timeit fun(z)

Output:
>100 loops, best of 3: 2.77 ms per loop
