Python: What activation function should I use to enforce rounding-like behavior?

python tensorflow neural-network derivative activation-function


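I need an activation function that rounds my tensors.

The derivative (gradient) of round() is 0 wherever it is defined, and TensorFlow likewise provides no useful gradient for it, which makes it unusable as an activation function.

I am looking for a function that enforces rounding-like behavior, so that the output of my model is not just an approximation of a number (my labels are integers).

I know that the composition tanh ∘ sigmoid has been used to force only values in {-1, 0, 1} to flow through a model. Is there some combination of differentiable functions that simulates rounding behavior?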

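Maybe the softmax cross-entropy loss, tf.nn.softmax_cross_entropy_with_logits_v2, is what you are looking for; see [link missing].

Also take a look at [link missing].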
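To make the suggestion above concrete, here is a minimal plain-Python sketch (no TensorFlow) of what a softmax cross-entropy loss computes when each integer label is treated as its own class. The helper names `softmax` and `cross_entropy` and the toy scores are assumptions made for illustration only. Note that this only works when the set of possible integer labels is known and finite.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Negative log-probability assigned to the integer class `label`."""
    return -math.log(softmax(logits)[label])

# Hypothetical scores for the integer classes 0..3; the true label is 2.
logits = [0.2, 0.1, 3.0, 0.3]
loss = cross_entropy(logits, 2)

# The prediction is the class index with the highest score, so it is always an integer.
prediction = max(range(len(logits)), key=lambda k: logits[k])
print(loss, prediction)
```

With this framing the model never has to round anything: it outputs one score per candidate integer and the predicted integer is the arg-max.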

If you would like to approximate round on the real line, you could do something like the following:

import tensorflow as tf

def approx_round(x, steepness=1):
    # Split x into its integer part and its fractional part in [0, 1).
    floor_part = tf.floor(x)
    remainder = x - floor_part  # equivalent to the floor-mod of x by 1
    # A sigmoid centred at 0.5 moves the output smoothly from floor(x) towards floor(x) + 1.
    return floor_part + tf.sigmoid(steepness*(remainder - 0.5))

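Indeed, there are ways to register your own gradients in TensorFlow (see [link missing], for example). However, I am not as familiar with implementing this part, since I don't use Keras/TensorFlow that often.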

As for a function that gives you the gradient of this approximation, it would be the following:

def approx_round_grad(x, steepness=1):
    remainder = x - tf.floor(x)  # fractional part of x in [0, 1)
    sig = tf.sigmoid(steepness*(remainder - 0.5))
    # Chain rule: the inner factor `steepness` multiplies the sigmoid derivative.
    return steepness*sig*(1 - sig)
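The answer mentions that TensorFlow lets you supply your own gradients but does not show how. One possible way is `tf.custom_gradient`. The sketch below assumes TensorFlow 2.x with eager execution; the factory name `make_round_with_soft_grad` is hypothetical, and pairing an exact rounding forward pass with the sigmoid-based gradient is one design choice among several, not something the original answer prescribes.

```python
import tensorflow as tf

def make_round_with_soft_grad(steepness=10.0):
    """Hypothetical helper: exact rounding forward, sigmoid-based gradient backward."""

    @tf.custom_gradient
    def round_with_soft_grad(x):
        def grad(upstream):
            # Gradient of floor(x) + sigmoid(steepness * (frac(x) - 0.5)).
            remainder = x - tf.floor(x)
            sig = tf.sigmoid(steepness * (remainder - 0.5))
            return upstream * steepness * sig * (1.0 - sig)

        # floor(x + 0.5) rounds halves up, matching the step location at 0.5.
        return tf.floor(x + 0.5), grad

    return round_with_soft_grad

soft_round = make_round_with_soft_grad(steepness=10.0)
x = tf.constant([0.2, 1.5, 2.9])
with tf.GradientTape() as tape:
    tape.watch(x)  # constants are not tracked automatically
    y = soft_round(x)
grads = tape.gradient(y, x)
print(y.numpy(), grads.numpy())
```

Wrapping the function in a factory keeps `steepness` out of the differentiated inputs, so the forward pass and its gradient cannot be called with mismatched values.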

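To be clear, this approximation assumes you are using a "steep enough" steepness parameter, since the sigmoid function does not reach exactly 0 or 1 except in the limit of large arguments.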
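To get a feel for what "steep enough" means, the following plain-Python sketch (no TensorFlow; `soft_fraction` is a hypothetical helper name) evaluates the sigmoid term at an exact integer, where a true round would contribute exactly 0.

```python
import math

def soft_fraction(remainder, steepness):
    """Sigmoid term of approx_round for a fractional part in [0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (remainder - 0.5)))

# At an exact integer the fractional part is 0, so an exact round would add 0.
for steepness in (1, 10, 20, 50):
    print(steepness, soft_fraction(0.0, steepness))
```

With the default steepness of 1 the term is roughly 0.38, so approx_round(3.0) lands near 3.38 rather than 3. At a steepness of 20 the error drops to roughly 4.5e-5.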

To do something like the half-sine approximation, you could use the following:

import numpy as np
import tensorflow as tf

def approx_round_sin(x, width=0.1):
    if width > 1 or width <= 0:
        raise ValueError('Width must be between zero (exclusive) and one (inclusive)')
    floor_part = tf.floor(x)
    remainder = x - floor_part  # fractional part of x in [0, 1)
    return floor_part + clipped_sin(remainder, width)

def clipped_sin(x, width):
    # Half a sine period rises from 0 to 1 inside a window of the given width around 0.5.
    half_width = width/2
    sin_part = (1 + tf.sin(np.pi*((x - 0.5)/width)))/2
    whole = sin_part*tf.cast(tf.abs(x - 0.5) < half_width, tf.float32)
    # Past the window the value is exactly 1 (>= so the window edge itself is not left at 0).
    whole += tf.cast(x >= 0.5 + half_width, tf.float32)
    return whole

def approx_round_grad_sin(x, width=0.1):
    if width > 1 or width <= 0:
        raise ValueError('Width must be between zero (exclusive) and one (inclusive)')
    remainder = x - tf.floor(x)  # fractional part of x in [0, 1)
    return clipped_cos(remainder, width)

def clipped_cos(x, width):
    # Derivative of clipped_sin: non-zero only inside the window around 0.5.
    half_width = width/2
    cos_part = np.pi*tf.cos(np.pi*((x - 0.5)/width))/(2*width)
    return cos_part*tf.cast(tf.abs(x - 0.5) < half_width, dtype=tf.float32)

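Comments:

@PMende Rounding is essentially invalid for this purpose. I completely disagree that the OP can write a meaningful function to do this. His/her best chance is to round the result afterwards and use some sort of MSE/Huber loss function.

The rounding function is not differentiable. A differentiable function is in particular continuous, and if you plot the rounding function you will see that it is not continuous.

@PMende, huh? ReLU assigns the derivative at 0 (its only non-differentiable point) the value 0. That is not math, it is convention. There is no way to do this (zero everywhere is the best you can do), and it would be clear if you gave it some thought or knew introductory calculus.

@modesitt If you gave it some thought and knew advanced calculus and the concept of distributions, it would be clear that there is a derivative. Using distributions as functions does not work in a straightforward application, but if this is really the desired behavior, an approximation can be created.

@modesitt The derivative of round is a Dirac comb with frequency 1 and phase 0.5. Each Dirac delta function can be seen as the limit of a Gaussian that becomes infinitely thin and infinitely tall. If you want rounding behavior and a gradient for such a function, you can choose some suitable variance for your normalized Gaussian. I am not saying this is computationally efficient, but if this is really what someone wants, it is feasible.

Unless I misunderstand what is said in the deepnotes link, this loss function is not the best choice for open-domain integers (which is what my labels are).

What do you mean by open-domain integers? If you mean that any integer can occur as a label, then I don't think you will find a suitable loss function. When fitting a model to a dataset, you should already have all possible labels available. Otherwise the fitting part makes no sense in the first place.

I am not looking for a loss function, but for an activation function.

I am not very familiar with the Keras/TensorFlow API, so I find it hard to offer advice on encapsulating this functionality. In principle, you would probably want to define the functions in such a way that you don't have to make sure the steepness hyperparameter is passed consistently to the base function and to its gradient. For example, you could do this by defining a third function that returns the base function and its gradient as a tuple.

Your function is also not differentiable at integer values.

It can approximate the round function, yes, but mathematically it is not differentiable. Please read the formal definition of differentiability.

@Tissuebox That's so cool! Glad to hear you got it working. Indeed, failing to generalize outside the support of the training data is a common problem in neural networks. It is interesting to hear that your architecture works outside the range of your training data! Although, given the nature of the activation function and its "derivative", it does make some intuitive sense to me. If you publish a paper, please mention me in the acknowledgments :P

I should point out specifically that neural networks generally have a hard time generalizing outside the range of the training data when bounded activation functions are used.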
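One of the comments suggests a single function that returns the approximation and its gradient together, so that the steepness cannot get out of sync between the two. A minimal scalar sketch in plain Python follows; the name `approx_round_with_grad` is hypothetical, and a TensorFlow version would use tensor ops instead of `math`. The finite-difference check at the end is only there to confirm that the analytic gradient includes the steepness factor.

```python
import math

def approx_round_with_grad(x, steepness=10.0):
    """Return (value, gradient) of the sigmoid-based rounding approximation at scalar x."""
    floor_part = math.floor(x)
    remainder = x - floor_part  # fractional part in [0, 1)
    sig = 1.0 / (1.0 + math.exp(-steepness * (remainder - 0.5)))
    value = floor_part + sig
    grad = steepness * sig * (1.0 - sig)  # chain rule through the sigmoid argument
    return value, grad

value, grad = approx_round_with_grad(1.3)

# Central finite difference away from the integers, where the function is smooth.
h = 1e-6
numeric = (approx_round_with_grad(1.3 + h)[0] - approx_round_with_grad(1.3 - h)[0]) / (2 * h)
print(value, grad, numeric)
```

As another comment points out, the approximation still jumps at integer values for any finite steepness, so the returned gradient is only meaningful between integers.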