Python Keras model.fit and model.predict produce wildly different results on exactly the same data in binary classification


Using a fixed random seed, I shuffle my training data and generate x_train, x_valid, y_train, y_valid, then call model.fit() with these newly split datasets.
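Roughly, the split looks like this (a minimal sketch: sklearn's train_test_split with a fixed random_state stands in for my actual shuffling code, and the synthetic arrays stand in for my real data):

import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the real features/labels, which are omitted here.
x_data = np.random.rand(1000, 20)
y_data = np.random.randint(0, 2, size=1000).astype(np.float32)

# Shuffle and split reproducibly with a fixed seed.
x_train, x_valid, y_train, y_valid = train_test_split(
    x_data, y_data, test_size=0.3, shuffle=True, random_state=124)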

The AUC and accuracy on the y_valid data look quite good in the epoch progress output:

# final validation loss, AUC, accuracy respectively (x_valid, y_valid):
Validation Scores: [0.23666608333587646, 0.9644553661346436, 0.8915975689888]
However, I decided to plot a confusion matrix using another library, so I called y_preds = model.predict(x_valid), expecting the results to match what I had seen with model.fit(). I was very wrong:

y_true = pd.Series(y_valid)
y_preds = pd.Series(model.predict(x_valid).squeeze())
print(y_true.value_counts(), '\n')
print(y_preds.value_counts())
This results in:

# True label value counts
1.0    140000
0.0    101120
dtype: int64 

# Predicted label value counts
0.0    241119
1.0         1
dtype: int64
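Note that model.predict on a sigmoid output head returns probabilities in [0, 1] rather than hard labels, so building a confusion matrix requires a thresholding step. A minimal sketch of how I turn the raw outputs into labels (the 0.5 cutoff and sklearn.metrics.confusion_matrix are placeholders, not my exact plotting code):

import numpy as np
from sklearn.metrics import confusion_matrix

# Raw sigmoid outputs are probabilities; threshold at 0.5 for hard labels.
probs = model.predict(x_valid).squeeze()
y_pred_labels = (probs > 0.5).astype(np.float32)
print(confusion_matrix(y_true, y_pred_labels))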
Clearly, the validation scores from model.fit() are not based on these terrible predictions, yet both were computed on exactly the same data. What is going on?

Full code for the model:

import numpy as np
import pandas as pd
from tensorflow import keras
from tensorflow.keras.models import Sequential, model_from_json
from tensorflow.keras.layers import Dense, BatchNormalization

# Note: input_shape, batch_size, epochs, x_train, x_valid, y_train, y_valid
# are globals defined elsewhere in the notebook.
class ModelWrapper():
    def __init__(self, name, transformer, loss=keras.losses.BinaryCrossentropy(), auc=True):
        self.name = name
        self.loss = loss
        self.metrics = [keras.metrics.AUC(), 'accuracy'] if auc else ['accuracy']
        self.transformer = transformer
        self.model = Sequential([
            BatchNormalization(input_shape=input_shape),
            Dense(150, activation='relu'), BatchNormalization(), #Dropout(0.5),
            Dense(150, activation='relu'), BatchNormalization(), #Dropout(0.5),
            Dense(100, activation='relu'), BatchNormalization(), #Dropout(0.5),
            Dense(100, activation='relu'), BatchNormalization(), #Dropout(0.5),
            Dense(100, activation='relu'), BatchNormalization(), #Dropout(0.5),
            Dense(50, activation='relu'), BatchNormalization(), #Dropout(0.5),
            Dense(50, activation='relu'), BatchNormalization(), #Dropout(0.5),
            Dense(1, activation='sigmoid')])
        
        self.x_train, self.x_valid, self.y_train, self.y_valid = transformer(x_train, x_valid, y_train, y_valid)
        self.model.compile(loss=self.loss, metrics=self.metrics, optimizer=keras.optimizers.Adam(learning_rate=3e-4))
        
    def fit(self):
        np.random.seed(124)
        print(f"Training {self.name}")
        history = self.model.fit(self.x_train, self.y_train,
            batch_size=batch_size,
            epochs=epochs,
            verbose=1,
            validation_data=(self.x_valid, self.y_valid),
            callbacks=[keras.callbacks.EarlyStopping(monitor='val_loss', patience=6, verbose=0),
            keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.15, patience=3)
            ], workers=16, use_multiprocessing=True)
        score = self.model.evaluate(self.x_valid, self.y_valid, verbose=0)
        print('Validation Scores:', score)
        model_dir = "data"
        
        # save training history
        hist_df = pd.DataFrame(history.history)
        with open('data/history_99e.json', 'w') as f:
            hist_df.to_json(f)
        
        # Save model structure.
        with open((f"{model_dir}/{self.name}.json"), "w") as json_file:
            json_file.write(self.model.to_json())
        
        # save model parameters
        self.model.save(f"{model_dir}/{self.name}.h5")

        return history
    
    def load(self):
        model_dir = "data"
        with open((f"{model_dir}/{self.name}.json"), "r") as json_file:
            self.model = model_from_json(json_file.read())
            self.model.load_weights(f"{model_dir}/{self.name}.h5")
        
    def predict(self, x):
        x = self.model.predict(x=self.transformer.x_scaler.transform(x), verbose=0)
        return x
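For completeness, this is roughly how the wrapper is driven (a hypothetical usage sketch; my_transformer, its x_scaler attribute, and the globals it relies on come from code not shown here):

# Hypothetical driver; 'my_transformer' and the globals
# (x_train, x_valid, y_train, y_valid, input_shape, batch_size, epochs)
# are assumed to be defined earlier in the notebook.
wrapper = ModelWrapper("baseline_nn", my_transformer)
history = wrapper.fit()             # trains, evaluates on (x_valid, y_valid), saves to data/
y_preds = wrapper.predict(x_valid)  # note: applies x_scaler.transform before predicting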