scikit-learn: Understanding how OneHotEncoder works - why am I seeing multiple 1s in the OHE column?

I am using an sklearn pipeline to perform one-hot encoding:

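from sklearn.compose import make_column_transformer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder
from xgboost import XGBClassifier

# Scale the numeric columns and one-hot encode the 'country' column
preprocess = make_column_transformer(
    (MinMaxScaler(), numeric_cols),
    (OneHotEncoder(), ['country'])
)

param_grid = {
    'xgbclassifier__learning_rate': [0.01, 0.005, 0.001],
}

model = make_pipeline(preprocess, XGBClassifier())

# Initialize the grid search model
# (the `iid` argument was removed in scikit-learn 0.24, so it is omitted here)
model = GridSearchCV(model, param_grid=param_grid, scoring='roc_auc',
                     verbose=1, refit=True, cv=3)
model.fit(X_train, y_train)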

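To see how the countries were one-hot encoded, I ran the following (I know there are two):

The result is:

A few questions:

Correct me if I am wrong, but I thought a one-hot encoding was a series of all 0s with a single 1. Why am I getting several 1s in one column?
When I call model.predict(X_test), does it apply the transformations defined in the pipeline during training?
How do I retrieve the feature names after calling fit_transform?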

To see how OneHotEncoder (OHE) works, start with a single categorical column:

import pandas as pd

df = pd.DataFrame({"categorical": ["a","b","a"]})
print(df)
  categorical
0           a
1           b
2           a

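from sklearn.preprocessing import OneHotEncoder
ohe = OneHotEncoder()
ohe.fit(df)
# transform() returns a sparse matrix; convert it to a dense array for display
ohe_out = ohe.transform(df).toarray()
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() (1.0+)
# ohe_df = pd.DataFrame(ohe_out, columns=ohe.get_feature_names_out(df.columns))
ohe_df = pd.DataFrame(ohe_out, columns=ohe.get_feature_names_out(["categorical"]))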
print(ohe_df)
   categorical_a  categorical_b
0            1.0            0.0
1            0.0            1.0
2            1.0            0.0
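
The output above also explains the multiple 1s in a column. One-hot encoding guarantees a single 1 per row within each encoded feature, not per column: a column such as categorical_a holds a 1 for every row whose value is "a". A minimal check of this, reusing the toy frame from the example above (the variable names encoded, row_sums and col_sums are illustrative only):

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

df = pd.DataFrame({"categorical": ["a", "b", "a"]})

# Dense array of shape (n_rows, n_categories)
encoded = OneHotEncoder().fit_transform(df).toarray()

# Each row contains exactly one 1 ...
row_sums = encoded.sum(axis=1)
# ... while each column counts how many rows belong to that category
col_sums = encoded.sum(axis=0)

print(row_sums)  # [1. 1. 1.]
print(col_sums)  # [2. 1.]
```

So a column with several 1s simply means that several rows share that category.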

OHE encodes every column it is given, including numeric ones, and treats each distinct value in a column as a category:

df = pd.DataFrame({"categorical":["a","b","a"],"nums":[0,1,0]})
print(df)
  categorical  nums
0           a     0
1           b     1
2           a     0

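ohe.fit(df)
# transform() returns a sparse matrix; convert it to a dense array for display
ohe_out = ohe.transform(df).toarray()
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() (1.0+)
# ohe_df = pd.DataFrame(ohe_out, columns=ohe.get_feature_names_out(df.columns))
ohe_df = pd.DataFrame(ohe_out, columns=ohe.get_feature_names_out(["categorical","nums"]))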
print(ohe_df)
   categorical_a  categorical_b  nums_0  nums_1
0            1.0            0.0     1.0     0.0
1            0.0            1.0     0.0     1.0
2            1.0            0.0     1.0     0.0