Python: do I need to split the data when training a multi-input, single-output neural network?

I have created a multi-input, single-output neural network for a regression problem. My inputs and expected outputs are ready. However, I am currently debating whether to split the data with sklearn's train_test_split or to split it manually. This is a continuation of (). Note that the model works with either splitting method, but with one of them its performance and predictions may be better. I would therefore like to know whether I need to follow one of these two approaches; I am not sure whether each is good or bad for my modeling and predictions.

My data

Inputs

Input 1:
array([406, 505, 545, ..., 601, 605, 450])
Shape: (1000,)
Input 2:
array([[-2.00370455, -2.35689664, -1.96147382, ..., 2.11014128,
2.59383321, 1.24209607],
[-1.97130549, -2.19063663, -2.02996445, ..., 2.32125568,
2.27316046, 1.48600614],
[-2.01526666, -2.40440917, -1.94321752, ..., 2.15266657,
2.68460488, 1.23534095],
...,
[-2.1359458 , -2.52428007, -1.75701785, ..., 2.25480819,
2.68114281, 1.75468981],
[-1.95868206, -2.23297167, -1.96401751, ..., 2.07427239,
2.60306072, 1.28556955],
[-1.80507278, -2.62199521, -2.08697271, ..., 2.34080577,
        2.48254585,  1.52028871]])
Shape: (1000, 2000)
Data preparation

Method 1 (train_test_split)
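The code for this method is not shown; a minimal sketch of what it might look like, using hypothetical arrays `X1`, `X2`, and `y` matching the shapes printed above, is:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data matching the shapes shown above
X1 = np.random.rand(1000)        # first input, shape (1000,)
X2 = np.random.rand(1000, 2000)  # second input, shape (1000, 2000)
y = np.random.rand(1000)         # single regression target

# Passing all arrays in one call keeps the rows aligned:
# every array is split with the same random indices.
X1_train, X1_test, X2_train, X2_test, y_train, y_test = train_test_split(
    X1, X2, y, test_size=0.2, random_state=42)

print(X1_train.shape, X2_train.shape, y_train.shape)  # (800,) (800, 2000) (800,)
```

Splitting both inputs and the target in a single call is the key point: separate calls with different random states would misalign the samples.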
Method 2 (manual split)

Building the neural network

Model fitting
One benefit of using train_test_split is that it selects the data at random. If your data contains a trend you have not noticed, or is ordered by time or some other linear pattern, then taking the first contiguous block for training and the second for testing means you may be modeling that underlying structure rather than a random sample.
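If you prefer the manual slicing below, you can still get the benefit of random selection by shuffling all arrays with one shared permutation before slicing (array names here are hypothetical):

```python
import numpy as np

# Hypothetical aligned arrays: two inputs and one target.
# X2 row i starts at i*2000, so alignment with X1 can be checked later.
X1 = np.arange(1000)
X2 = np.arange(2000000).reshape(1000, 2000)
y = np.arange(1000, dtype=float)

rng = np.random.default_rng(seed=0)
perm = rng.permutation(len(y))   # one permutation shared by all arrays

# Apply the same row order everywhere so samples stay aligned...
X1, X2, y = X1[perm], X2[perm], y[perm]

# ...then a contiguous slice is effectively a random split.
split = int(0.8 * len(y))
X1_train, X1_test = X1[:split], X1[split:]
X2_train, X2_test = X2[:split], X2[split:]
y_train, y_test = y[:split], y[split:]
```

The essential detail is the single `perm` array: shuffling each input independently would destroy the row correspondence between inputs and target.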
input_embeddings = original_embeddings[:2000]
input_df = df[:2000]
test_embeddings = original_embeddings[2000:]
test_df = df[2000:]
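One thing worth checking in the slices above: the shapes printed earlier suggest only 1000 samples, and slicing a 1000-row array with [:2000] returns all rows while [2000:] returns an empty array, which would leave the validation set empty. NumPy does not raise an error for out-of-range slices; a hypothetical array illustrates this:

```python
import numpy as np

# 1000 samples, as the shapes printed above suggest
original_embeddings = np.zeros((1000, 2000))

input_embeddings = original_embeddings[:2000]  # silently takes all 1000 rows
test_embeddings = original_embeddings[2000:]   # silently empty

print(input_embeddings.shape)  # (1000, 2000)
print(test_embeddings.shape)   # (0, 2000)
```

If the data really has 1000 rows, a split index such as 800 would be needed for an 80/20 split.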
# Assign input values (training & validation)
y, original_model_inputs = create_inputs(input_df, input_embeddings) #Used for Training
y_valid, valid_model_inputs = create_inputs(test_df, test_embeddings) #Used for Validation
# Imports needed by the function below (Keras functional API)
import numpy as np
from tensorflow.keras.layers import Input, Dense, BatchNormalization, Concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

def build_model(input1, input2):
    """
    Creates a multi-channel ANN capable of accepting multiple inputs.
    :param input1: 1-D array for the first branch (Temperature)
    :param input2: 2-D array for the second branch (Embeddings)
    :return: the compiled model with a single output
    """
    # Make the 1-D temperature array 2-D so that input1.shape[1] exists
    input1 = np.expand_dims(input1, 1)
    # Define two sets of inputs for the model
    input1 = Input(shape=(input1.shape[1],))
    input2 = Input(shape=(input2.shape[1],))
    # The first branch operates on the first input (Temperature)
    x = Dense(units=128, activation="relu")(input1)
    x = BatchNormalization()(x)
    x = Dense(units=128, activation="relu")(x)
    x = BatchNormalization()(x)
    x = Model(inputs=input1, outputs=x)
    # The second branch operates on the second input (Embeddings)
    y = Dense(units=128, activation="relu")(input2)
    y = BatchNormalization()(y)
    y = Dense(units=128, activation="relu")(y)
    y = BatchNormalization()(y)
    y = Model(inputs=input2, outputs=y)
    # Merge the two branches into a single large vector
    combined = Concatenate()([x.output, y.output])
    # Apply a FC layer, then the final single-unit regression output
    outputs = Dense(128, activation='relu')(combined)
    outputs = Dense(1)(outputs)
    # The model accepts the inputs of the two branches and outputs a single value
    model = Model(inputs=[x.input, y.input], outputs=outputs)
    # Compile the model; accuracy is meaningless for regression, so track MAE instead
    model.compile(loss='mse', optimizer=Adam(learning_rate=0.001), metrics=['mse', 'mae'])
    # Print the model summary
    model.summary()
    return model
# Build the model from the two training inputs, then fit it
model = build_model(original_model_inputs['input1'], original_model_inputs['input2'])
history = model.fit(
    [original_model_inputs['input1'], original_model_inputs['input2']], y,
    validation_split=0.2, batch_size=25, epochs=75, verbose=0, shuffle=True)