Python: how to weight hidden scalar outputs in Keras with a scalar constant


Thank you for your time.

I am trying to build a neural network for a regression on discrete values, but with one special twist: the input should be processed in two ways (model A and model B) and the two results should then be combined with a weighting. The outputs are combined via the formula A*G + B*(1-G), with G = 1/(1 + exp(-gamma*(input_weighting - c))). Both gamma and c are supposed to be learned during training. I am struggling with the trainable variables gamma and c and with the subtraction (1-G). My current code fails in two different places:

    # two models for time series (convolutional approach)
    input_model_A = keras.Input(shape=(12,))
    model_A = Dense(12)(input_model_A)
    input_model_B = keras.Input(shape=(12,))
    model_B = Dense(24)(input_model_B)

    # input for model weighting
    input_weighting = keras.Input(shape=[1,], name="vola_input")

    # exponent = gamma * (input_weighting - c)
    class MyLayer(Layer):
        def __init__(self, **kwargs):
            super(MyLayer, self).__init__(**kwargs)

        def build(self, input_shape=[[1,1],[1,1]]):
            self._c = K.variable(0.5)
            self._gamma = K.variable(0.5)
            self.trainable_weights = [self._c, self._gamma]

            super(MyLayer, self).build(input_shape)  # Be sure to call this at the end

        def call(self, vola, **kwargs):
            intermediate = substract([vola, self._c])
            result = multiply([self._gamma, intermediate])
            return result

        def compute_output_shape(self, input_shape):
            return input_shape[0]

    exponent = MyLayer()(input_weighting)
    # G = 1/(1+exp(-exponent)) 
    G = keras.layers.Dense(1, activation="sigmoid", name="G")(exponent)

    # output = G*A + (1-G)*B
    weighted_A = keras.layers.Multiply(name="layer_A")([model_A.outputs[0], G])
    pseudoinput = Input(shape=[1, 1], name="pseudoinput_input", tensor=K.variable([1]))
    weighted_B = keras.layers.Multiply(name="layer_B")([model_B.outputs[0], keras.layers.Subtract()([pseudoinput, G])])
    merge_layer = keras.layers.Add(name="merge_layer")([weighted_A, weighted_B])
    output_layer = keras.layers.Dense(units=1, activation='relu', name="output_layer")(merge_layer)
    
    model = keras.Model(inputs=[input_model_A, input_model_B, input_weighting], outputs=[output_layer])
    optimizer = SGD(learning_rate=0.01, momentum=0.0, nesterov=False)
    model.compile(optimizer=optimizer, loss='mean_squared_error')
  • My custom layer has a bug; I do not understand where the error in the input-dimension definition comes from (see the add_weight sketch below for the usual way to declare such trainable scalars).
  • I found and tried both of these suggestions, but neither of them worked:

    File "...\keras\backend\tensorflow_backend.py", line 75, in symbolic_fn_wrapper
        return func(*args, **kwargs)
      File "...\keras\engine\base_layer.py", line 446, in __call__
        self.assert_input_compatibility(inputs)
      File "...\keras\engine\base_layer.py", line 358, in assert_input_compatibility
        str(K.ndim(x)))
    ValueError: Input 0 is incompatible with layer c: expected min_ndim=2, found ndim=1

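For reference, a minimal sketch (not from the original post) of how trainable scalars are usually registered inside a custom Keras layer: add_weight creates the variables and tracks them as trainable weights, so nothing has to be assigned to trainable_weights by hand. The class name and initial values are purely illustrative, and tensorflow.keras is assumed.

    import tensorflow as tf
    from tensorflow.keras.layers import Layer

    class ScaledShift(Layer):  # illustrative name, not taken from the question
        def build(self, input_shape):
            # add_weight registers the scalars as trainable weights automatically
            self._c = self.add_weight(name="c", shape=(),
                                      initializer=tf.keras.initializers.Constant(0.5),
                                      trainable=True)
            self._gamma = self.add_weight(name="gamma", shape=(),
                                          initializer=tf.keras.initializers.Constant(0.5),
                                          trainable=True)
            super(ScaledShift, self).build(input_shape)

        def call(self, inputs):
            # gamma * (inputs - c); broadcasting keeps the input's shape unchanged
            return self._gamma * (inputs - self._c)
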
Frankly, I would be interested in the reason behind either of the problems (or both), but I would prefer simply to find a solution that delivers the described architecture.

Here is my suggestion, with some comments:

    import numpy as np
    import tensorflow as tf
    from tensorflow.keras import backend as K
    from tensorflow.keras.layers import Input, Dense, Layer, Lambda, Add
    from tensorflow.keras.models import Model

    input_model_A = Input(shape=(12,))
    model_A = Dense(24)(input_model_A)
    input_model_B = Input(shape=(12,))
    model_B = Dense(24)(input_model_B)
    # model_A and model_B must have the same last dimensionality
    # otherwise it is impossible to apply Add operation below
    
    # input for model weighting
    input_weighting = Input(shape=(1,), name="vola_input")
    
    class MyLayer(Layer):
        def __init__(self, **kwargs):
            
            super(MyLayer, self).__init__(**kwargs)
            self._c = K.variable(0.5)
            self._gamma = K.variable(0.5)
    
        def call(self, vola, **kwargs):
            x = self._gamma * (vola - self._c) # gamma * (input_weighting - c)
            result = tf.nn.sigmoid(x) # 1 / (1 + exp(-x))
            return result
    
    G = MyLayer()(input_weighting) # 1/(1+exp(-gamma * (input_weighting - c)))
    
    weighted_A = Lambda(lambda x: x[0]*x[1])([model_A,G]) # A*G
    weighted_B = Lambda(lambda x: x[0]*(1-x[1]))([model_B,G]) # B*(1-G)
    
    merge_layer = Add(name="merge_layer")([weighted_A, weighted_B]) # A*G + B*(1-G)
    output_layer = Dense(units=1, activation='relu', name="output_layer")(merge_layer)
    
    model = Model(inputs=[input_model_A, input_model_B, input_weighting], outputs=[output_layer])
    model.compile(optimizer='adam', loss='mean_squared_error')
    
    
    # create dummy data and fit
    n_sample = 100
    Xa = np.random.uniform(0,1, (n_sample,12))
    Xb = np.random.uniform(0,1, (n_sample,12))
    W = np.random.uniform(0,1, n_sample)
    y = np.random.uniform(0,1, n_sample)
    
    model.fit([Xa,Xb,W], y, epochs=3)
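
As a quick sanity check (not part of the original answer), the learned scalars can be read back from the fitted model; this assumes the code block directly above, where MyLayer, K and model are defined:

    # Retrieve the custom layer from the fitted model and print its scalars.
    gate = next(layer for layer in model.layers if isinstance(layer, MyLayer))
    print("learned gamma:", K.get_value(gate._gamma))
    print("learned c:", K.get_value(gate._c))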
    

Here is the running notebook:

Comments on the answer:

In your implementation I do not see where the formula 1/exp(gamma*(input_weighting - c)) is applied... you only compute gamma*(input_weighting - c) and then multiply with G, which is a Dense layer.

I use the activation function of that Dense layer: sigmoid is 1/(1 + exp(-(w*i + b))), where i is the input of the Dense layer, w the weight for that input and b the bias. Writing this down I realize that the gamma which is actually learned in the end is w*gamma (illustrated numerically below). I have corrected the formula in the question, because that sigmoid is exactly the formula I want.

OK, now it is clear. If you are interested, I can offer you my suggestion/implementation.

Hi Marco, sorry, somehow I did not see your second comment. I am still interested! Thanks for the offer and for the reminder! I will check it within the next two days and then upvote and accept it.

Thank you very much for your solution, it works nicely. As a side note: I originally used standalone Keras, but I switched to tensorflow.keras in order to use your suggestion (which probably brings additional benefits anyway).
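
To make the remark about the effectively learned slope concrete, here is a small numeric sketch (not from the original thread; all values are made up): feeding exponent = gamma*(x - c) into a Dense(1, activation="sigmoid") layer produces sigmoid(w*exponent + b), which is the same as a sigmoid with slope w*gamma applied to (x - c).

    import numpy as np

    gamma, c = 0.5, 0.5   # the custom layer's scalars (illustrative values)
    w, b = 1.7, -0.2      # weight and bias of the Dense(1, sigmoid) layer (illustrative)
    x = 0.8               # one value of input_weighting

    exponent = gamma * (x - c)                                    # output of the custom layer
    via_dense = 1.0 / (1.0 + np.exp(-(w * exponent + b)))         # what Dense(1, sigmoid) computes
    direct = 1.0 / (1.0 + np.exp(-((w * gamma) * (x - c) + b)))   # same value, slope written as w*gamma

    assert np.isclose(via_dense, direct)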
    File "...\keras\backend\tensorflow_backend.py", line 75, in symbolic_fn_wrapper
        return func(*args, **kwargs)
      File "...\keras\engine\base_layer.py", line 446, in __call__
        self.assert_input_compatibility(inputs)
      File "...\keras\engine\base_layer.py", line 358, in assert_input_compatibility
        str(K.ndim(x)))
    ValueError: Input 0 is incompatible with layer c: expected min_ndim=2, found ndim=1
    