
Python Keras changes the dimensions of the input shape when checking shape compatibility


I have the following Keras model, which accepts both non-sequential and sequential inputs:

import sys

import numpy as np
from keras import backend as K
from keras.layers import Dense, Input, Lambda, LSTM, Masking
from keras.models import Model
from keras.optimizers import Adam
from keras.utils import plot_model

# Model parameters
units = 100
batch_size = 64
epochs = 1

encoder_inputs = Input(shape=(None, 1), name='encoder')
# Allows handling of variable length inputs by applying a binary mask to the specified mask_value.
masker = Masking(mask_value=sys.float_info.max)
masker(encoder_inputs)


nonseq_inputs = np.array([
    tensors['product_popularity'],
    tensors['quarter_autocorr'],
    tensors['year_autocorr']
]).T

nonseq_dim = nonseq_inputs.shape[1]
nonseq_input = Input(shape=(nonseq_dim,), name='nonsequential_input')
hidden_dense = Dense(units)(nonseq_input)
zeros = Lambda(lambda x: K.zeros_like(x), output_shape=lambda s: s)(hidden_dense)

encoder = LSTM(units, return_state=True)
encoder_outputs, state_h, state_c = encoder(encoder_inputs, initial_state=[hidden_dense, zeros])

# Keep encoder states for decoder, discard outputs
encoder_states = [state_h, state_c]

# Set up the decoder taking the encoder_states to be the initial state vector of the decoder.
decoder_inputs = Input(shape=(None, 1), name='decoder')

# Full output sequences and internal states are returned.  Returned states are used in prediction / inference
masker(decoder_inputs)
decoder = LSTM(units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder(decoder_inputs, initial_state=encoder_states)

# Gives continuous output at each time step
decoder_dense = Dense(1)
decoder_outputs = decoder_dense(decoder_outputs)

# create model that takes encoder_input_data and decoder_input_data and creates decoder_target_data
model = Model([nonseq_input, encoder_inputs, decoder_inputs], decoder_outputs)

model.summary()

plot_model(model, 'model.png')

# Get encoder inputs and standardise
# (get_time_block_series, centre_data and invert_transform are helper
# functions defined elsewhere in the original script)
encoder_input = get_time_block_series(series_array, date_to_index, train_encoding_start, train_encoding_end)
encoder_input, encoder_series_mean = centre_data(encoder_input)

# Get targets for the decoder
decoder_targets = get_time_block_series(series_array, date_to_index, train_pred_start, train_pred_end)
decoder_targets, _ = centre_data(decoder_targets, means=encoder_series_mean)

# Lag the target series to apply teacher forcing and mitigate error propagation
decoder_input = np.zeros_like(decoder_targets)
decoder_input[:, 1:, 0] = decoder_targets[:, :-1, 0]
decoder_input[:, 0, 0] = encoder_input[:, -1, 0]

model.compile(Adam(), loss='mean_absolute_error')

history = model.fit(
    [nonseq_inputs, encoder_input, decoder_input],
    decoder_targets,
    batch_size=batch_size,
    epochs=epochs,
    validation_split=0.2,
    shuffle=True
)

# Build a model to predict with
encoder_model = Model([nonseq_input, encoder_inputs], encoder_states)

decoder_state_input_h = Input(shape=(units,))
decoder_state_input_c = Input(shape=(units,))
decoder_initial_state = [decoder_state_input_h, decoder_state_input_c]

decoder_outputs, state_h, state_c = decoder(decoder_inputs, initial_state=decoder_initial_state)
decoder_states = [state_h, state_c]

decoder_model = Model([decoder_inputs] + decoder_initial_state, [decoder_outputs] + decoder_states)

# Predict
encoder_input_data = get_time_block_series(series_array, date_to_index, val_encoding_start, val_encoding_end)
encoder_input_data, encoder_series_mean = centre_data(encoder_input_data)

decoder_target_data = get_time_block_series(series_array, date_to_index, val_pred_start, val_pred_end)
decoder_target_data, _ = centre_data(decoder_target_data, encoder_series_mean)

series, y, yhat = predict(
    encoder_model,
    decoder_model,
    encoder_input_data,
    decoder_targets,
    encoder_series_mean,
    horizon,
    sp,
    nonseq_inputs
)

def predict(encoder_model, decoder_model, encoder_input, decoder_targets, means, horizon, sample_index, nonseq_inputs):
    encode_series = encoder_input[sample_index:sample_index + 1]
    nonseq_input = nonseq_inputs[sample_index, :]
    yhat = decode_sequence(encoder_model, decoder_model, encode_series, horizon, nonseq_input)

    encode_series = encode_series.flatten()
    yhat = yhat.flatten()
    y = decoder_targets[sample_index, :, :1].flatten()

    encode_series, yhat, y = invert_transform(encode_series, yhat, y, means[sample_index])

    return encode_series, y, yhat

def decode_sequence(encoder_model, decoder_model, input_sequence, output_length, nonseq_input=None):
    # Encode input as state vectors
    state_values = encoder_model.predict([nonseq_input, input_sequence], batch_size=1)

    # Generate empty target sequence of length 1
    target_sequence = np.zeros((1, 1, 1))

    # Populate the first target sequence with the end of the encoding series
    target_sequence[0, 0, 0] = input_sequence[0, -1, 0]

    # Sampling loop for a batch of sequences - we will fill decoded_sequence with predictions
    # (to simplify we assume a batch_size of 1)
    decoded_sequence = np.zeros((1, output_length, 1))

    for i in range(output_length):
        output, h, c = decoder_model.predict([target_sequence] + state_values)

        decoded_sequence[0, i, 0] = output[0, 0, 0]

        # Update the target sequence (of length 1)
        target_sequence = np.zeros((1, 1, 1))
        target_sequence[0, 0, 0] = output[0, 0, 0]

        # Update states
        state_values = [h, c]

    return decoded_sequence
Here is an image of the model:

When I call the predict function on a set of non-sequential inputs and a set of sequential inputs, I get the following error:

ValueError: Error when checking input: expected nonsequential_input to have shape (3,) but got array with shape (1,)

I can confirm that I am indeed passing an array of shape (3,), as required by the model's input list (I printed it out as a sanity check). When I debugged the code, I followed the standardisation of the input data into the training_utils.py module, down to the shape compatibility check:

# Check shapes compatibility.
if shapes:
    for i in range(len(names)):
        if shapes[i] is not None and not K.is_tensor(data[i]):
            data_shape = data[i].shape
            shape = shapes[i]
            if data[i].ndim != len(shape):
                raise ValueError(
                    'Error when checking ' + exception_prefix +
                    ': expected ' + names[i] + ' to have ' +
                    str(len(shape)) + ' dimensions, but got array '
                    'with shape ' + str(data_shape))
            if not check_batch_axis:
                data_shape = data_shape[1:]
                shape = shape[1:]
            for dim, ref_dim in zip(data_shape, shape):
                if ref_dim != dim and ref_dim:
                    raise ValueError(
                        'Error when checking ' + exception_prefix +
                        ': expected ' + names[i] + ' to have shape ' +
                        str(shape) + ' but got array with shape ' +
                        str(data_shape))
When I step through this code, up to the line "if not check_batch_axis:" the variable data_shape has the correct shape dimension (i.e. 3). However, this function is always called with check_batch_axis=False, which means the if statement is always entered. In this part of the code the correctly determined data_shape is overwritten and incorrectly set to 1:

if not check_batch_axis:
    data_shape = data_shape[1:]
    shape = shape[1:]
I don't know why this happens, or whether I am doing something else incorrectly. All I can confirm is that the numpy arrays I pass in the list to the predict function do have the correct shapes, but they are changed by the code shown above. Does anyone know what I am doing wrong?
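To make the slicing concrete, here is a minimal sketch of how a (3,) array can end up being reported as (1,). The expand_dims step is an assumption about Keras's input standardisation (which promotes 1-D arrays to columns before the check), inferred from the reported error:

import numpy as np

data = np.array([0.1, 0.2, 0.3])   # a single sample, shape (3,)
data = np.expand_dims(data, 1)     # promoted to a column, shape (3, 1)

shape = (None, 3)                  # reference shape for Input(shape=(3,))

# check_batch_axis is False, so both shapes drop their first axis:
data_shape = data.shape[1:]        # (1,)  <- the shape reported in the error
shape = shape[1:]                  # (3,)
print(data_shape, shape)           # (1,) (3,) -> shapes do not match

In other words, the (3,) array is interpreted as three samples of dimension 1, not one sample of dimension 3.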

The model is based on the code in the following blog post:

Edit: the requested details are below.

The shapes of the arrays passed to the fit function:

The arrays are passed as a list with the following shapes:

[(478, 3), (478, 240), (478, 26)]

For context, I have 478 unique series. Each has three time-invariant features, which I pass as the first input; the second input contains the actual series, and the last element is the input to the decoder, which is used to predict 26 points. I have updated the code above to show the line making the fit call.
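(For reference, a quick way to confirm these shapes just before the fit call; a sketch using the variable names from the code above:)

# Print the shape of each array passed to model.fit as a sanity check
for name, arr in (('nonseq_inputs', nonseq_inputs),
                  ('encoder_input', encoder_input),
                  ('decoder_input', decoder_input)):
    print('{}: {}'.format(name, arr.shape))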

Edit 2: added lines to print the shapes inside the decode function:

def decode_sequence(encoder_model, decoder_model, input_sequence, output_length, nonseq_input=None):
    # Encode input as state vectors
    print('nonseq_input.shape: {}'.format(nonseq_input.shape))
    print('input_sequence.shape: {}'.format(input_sequence.shape))
    state_values = encoder_model.predict([nonseq_input, input_sequence], batch_size=1)
(The rest of the function is the same as before; only the print statements were added.) The result is as follows:

Train on 382 samples, validate on 96 samples
Epoch 1/1
2019-01-13 08:37:08.112955: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
 64/382 [====>.........................] - ETA: 9s - loss: 2.7368
128/382 [=========>....................] - ETA: 4s - loss: 2.6203
192/382 [==============>...............] - ETA: 2s - loss: 2.4305
256/382 [===================>..........] - ETA: 1s - loss: 2.2558
320/382 [========================>.....] - ETA: 0s - loss: 2.2033
382/382 [==============================] - 4s 10ms/step - loss: 2.2386 - val_loss: 3.1458
nonseq_input.shape: (3,)
input_sequence.shape: (1, 240, 1)

The exception raised is the same as described in the first part of the question.

The problem is that the input layer expects a batch of data, i.e. a 2D array where the first axis is the batch dimension and the second axis is the data dimension, but you are passing in a single sample as a 1D array. Since sp is an integer, the new array nonseq_input = nonseq_inputs[sample_index, :] has shape (3,), i.e. it is one-dimensional. Instead you should use

nonseq_input = nonseq_inputs[[sample_index], :]

which maintains a 2D array.
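A minimal illustration of the difference, using a dummy array of the same shape as in the question:

import numpy as np

nonseq_inputs = np.zeros((478, 3))   # dummy stand-in for the real features
sample_index = 100                   # "sp" in the question

print(nonseq_inputs[sample_index, :].shape)                    # (3,)   -- 1D, batch axis lost
print(nonseq_inputs[[sample_index], :].shape)                  # (1, 3) -- batch axis kept
print(nonseq_inputs[sample_index:sample_index + 1, :].shape)   # (1, 3) -- equivalent slice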

Nice analysis and details! What are the exact shapes of the arrays passed to the fit function? I suspect I know what went wrong, but I would like to double-check. Could you please include the line containing the fit call, together with the shape of each input array? Once you have done that, let me know if you need more information. Thanks for looking.

@Aesir A few questions: 1. What is sp? 2. Why do you think entering the if block is incorrect? You did not specify batch_shape on the input layer, so it is not checked; consequently the layer shapes and data shapes are only compared from the second dimension onwards. 3. What is the output of nonseq_input.shape just before it is passed to encoder_model.predict?

@a_guest: sp is just an integer (100), the index of the series I want to predict. Any integer in the interval [0, 477] is valid. I did not think entering the block was wrong, I just did not understand why it happens. I have added the shape output to the question above. Thanks for your help!

Thank you very much. You are quite right, I completely missed that.