One-hot encoded data with scikit-learn GridSearchCV on a Keras model
Tags: scikit-learn, neural-network, keras, grid-search, one-hot-encoding

The problem with this code is that I am feeding the classifier one-hot encoded data:
X_train, X_test, y_train, y_test are all one-hot encoded.
But the classifier is predicting the outputs y_pred_test and y_pred_train in numeric form (which I also think is incorrect). Can anyone help?

This is a dummy example, so don't worry about the low accuracy; the question is only why it is not predicting the output in one-hot encoded form. Thanks.
The classifier is predicting the probability of each class. If you want the final prediction, use:

y_pred.argmax(axis=-1)
Sorry, I don't understand your suggestion. y_test is one-hot rows like [[0. 0. 0. 0. 0. 0. 0. 1. 0.] [0. 1. 0. 0. 0. 0. 0. 0. 0.] ...], and comparing "y_hat_test before" with "y_hat_test after applying the axis=-1 thing" gives me labels like [2 3 5]. What does this mean?

You should only apply it to the predictions coming from a keras.model. y_hat_test is the predicted value.

Just to add, your suggestion works on the output of predict after I save the model and load it back. So the exact story is: 1) GridSearchCV is applied to a Keras model. 2) The model is fit on the train data. 3) The model now predicts in label-encoded form (which I can decode). 4) The Keras model is saved in hdf5 format. 5) The Keras model is loaded, and this loaded model now predicts probability-type output, i.e. the probability of every class for each sample.
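The argmax step from the answer can be sketched in plain NumPy. The probability values below are made up for illustration; `np.eye` then shows how to turn the resulting labels back into one-hot rows if that is the format the rest of the pipeline expects:

```python
import numpy as np

# Hypothetical softmax output for 3 samples over 4 classes.
probs = np.array([
    [0.1, 0.7, 0.1, 0.1],
    [0.8, 0.1, 0.05, 0.05],
    [0.2, 0.2, 0.2, 0.4],
])

# Collapse each row of probabilities to an integer class label.
labels = probs.argmax(axis=-1)
print(labels)  # [1 0 3]

# Re-expand the labels to one-hot rows if downstream code expects them.
one_hot = np.eye(probs.shape[1])[labels]
print(one_hot)
```

This is the same transformation in both directions: argmax collapses a one-hot (or probability) row to a label, and indexing into an identity matrix expands a label back to a one-hot row.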
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd
x=pd.DataFrame()
x['names']= np.arange(1,10)
x['Age'] = np.arange(1,10)
y=pd.DataFrame()
y['target'] = np.arange(1,10)
from sklearn.preprocessing import OneHotEncoder, Normalizer
ohX= OneHotEncoder()
x_enc = ohX.fit_transform(x).toarray()
ohY = OneHotEncoder()
y_enc = ohY.fit_transform(y).toarray()
print (x_enc)
print("____")
print (y_enc)
import keras
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.models import load_model
from keras.layers.advanced_activations import LeakyReLU
marker="-------"
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
def create_model(learn_rate=0.001):
    # Use Adam with an explicit lr so the grid-searched learn_rate is actually
    # applied; the string "adam" would silently ignore the parameter.
    from keras.optimizers import Adam
    model = Sequential()
    model.add(Dense(units=15, input_dim=18, kernel_initializer='normal', activation="tanh"))
    model.add(Dense(units=9, activation="softmax"))
    model.compile(loss="categorical_crossentropy", optimizer=Adam(lr=learn_rate), metrics=['accuracy'])
    return model
if __name__ == "__main__":
    X_train, X_test, y_train, y_test = train_test_split(x_enc, y_enc, test_size=0.33, random_state=42)
    print("\n\n", marker*5, " Classification\nX_train shape is: ", X_train.shape, "\tX_test shape is:", X_test.shape)
    print("\ny_train shape is: ", y_train.shape, "\t y_test shape is:", y_test.shape, "\n\n")
    norm = Normalizer()
    # model
    X_train = norm.fit_transform(X_train)
    X_test = norm.transform(X_test)
    # Note: 'val_loss' is only available when validation data is provided;
    # without it this callback has nothing to monitor.
    earlyStopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=0, verbose=0, mode='auto')
    model = KerasClassifier(build_fn=create_model, verbose=0)
    fit_params = {'callbacks': [earlyStopping]}
    # grid
    # batch_size = [50, 100, 200, 300, 400]
    epochs = [2, 5]
    learn_rate = [0.1, 0.001]
    param_grid = dict(epochs=epochs, learn_rate=learn_rate)
    grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
    # Predicting
    print(np.shape(X_train), np.shape(y_train))
    y_train = np.reshape(y_train, (-1, np.shape(y_train)[1]))
    print("y_train shape after reshaping", np.shape(y_train))
    grid_result = grid.fit(X_train, y_train, **fit_params)
    print("grid score using params: ", grid_result.best_score_, " ", grid_result.best_params_)
    # scores
    print("SCORES")
    print(grid_result.score(X_test, y_test))
    # summarize results
    # print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
    # means = grid_result.cv_results_['mean_test_score']
    # stds = grid_result.cv_results_['std_test_score']
    # params = grid_result.cv_results_['params']
    # for mean, stdev, param in zip(means, stds, params):
    #     print("%f (%f) with: %r" % (mean, stdev, param))
    print("\n\n")
    print("y_test is", y_test)
    y_hat_test = grid.predict(X_test)
    y_hat_train = grid.predict(X_train)
    print("y_hat_test is ", y_hat_test)
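To get predictions back into the same one-hot layout as y_test, one option is to run the label-style output of grid.predict back through the fitted target encoder. Below is a self-contained sketch with a fresh encoder and hypothetical class indices standing in for ohY and the real predictions; it assumes a scikit-learn version that exposes `categories_` (0.20+), and that the wrapper returns class indices rather than the original target values:

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Fit an encoder on the original target values, mirroring ohY in the script.
y = np.arange(1, 10).reshape(-1, 1)
enc = OneHotEncoder()
enc.fit(y)

# Hypothetical class indices, as KerasClassifier.predict might return them.
pred_indices = np.array([1, 2, 4])

# Map indices back to the original target values via the encoder's categories,
# then re-encode to get predictions in the same one-hot layout as y_test.
pred_values = enc.categories_[0][pred_indices].reshape(-1, 1)
pred_onehot = enc.transform(pred_values).toarray()
print(pred_values.ravel())  # [2 3 5]
print(pred_onehot.shape)    # (3, 9)
```

If the predictions are already the original target values rather than indices, the `categories_` lookup can be skipped and the labels passed to `transform` directly.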