Can Python scikit-learn's LogisticRegression() automatically standardize input data to z-scores?

python scikit-learn

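Is there a way to get an instance of LogisticRegression() to automatically standardize the data supplied for fitting/training to z-scores when building the model? LinearRegression() has a normalize=True parameter, but maybe this doesn't make sense for LogisticRegression()?

If so, would I have to normalize unlabeled input vectors by hand (i.e., recompute each column's mean and standard deviation) before calling predict_proba()? That would be odd if the model has already performed that possibly costly computation.

Thanks

Is this what you are looking for?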

from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression


X, y = make_classification(n_samples=1000, n_features=100, weights=[0.1, 0.9], random_state=0)
X.shape

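# build the pipeline: first standardize by subtracting the mean and dividing by the std,
# then do the classification
# note: current scikit-learn releases use class_weight='balanced' in place of the older 'auto'
pipe = make_pipeline(StandardScaler(), LogisticRegression(class_weight='balanced'))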

# fit
pipe.fit(X, y)
# predict
pipe.predict_proba(X)

# to get back the fitted mean/std
scaler = pipe.steps[0][1]
scaler.mean_
Out[12]: array([ 0.0313, -0.0334,  0.0145, ..., -0.0247,  0.0191,  0.0439])

# the per-feature standard deviation is exposed as scale_ (older releases called it std_)
scaler.scale_
Out[13]: array([ 1.    ,  1.0553,  0.9805, ...,  1.0033,  1.0097,  0.9884])
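
On the second part of the question (whether new inputs must be re-normalized by hand): as far as I can tell, the fitted pipeline reuses the mean and standard deviation learned during fit() whenever predict_proba() is called, so nothing is recomputed. A minimal sketch to check this, assuming a small synthetic dataset and a held-out split (the variable names X_new, proba_manual, etc. are made up for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic problem; X_new stands in for unlabeled data seen after training.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_new, y_train, _ = train_test_split(X, y, random_state=0)

pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_train, y_train)

# make_pipeline names each step after its lowercased class name.
scaler = pipe.named_steps["standardscaler"]
clf = pipe.named_steps["logisticregression"]

# Standardize by hand with the statistics stored at fit time...
X_new_manual = (X_new - scaler.mean_) / scaler.scale_
proba_manual = clf.predict_proba(X_new_manual)

# ...and compare with letting the pipeline do it.
proba_pipe = pipe.predict_proba(X_new)
same = np.allclose(proba_pipe, proba_manual)
print(same)
```

If the two results agree, the raw X_new can be passed straight to the pipeline, which is the main reason to bundle the scaler and the classifier together.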

By z-score, do you mean (x - x.mean()) / x.std()?
Yes, that is the usual meaning of "standard score".
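
To make the formula in the comment above concrete, here is a small sketch comparing the hand-written z-score with what StandardScaler produces. The toy array is made up for illustration. Note the parentheses: without them, x - x.mean() / x.std() divides only the mean by the std. StandardScaler uses the population standard deviation (ddof=0), which is also NumPy's default for std().

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: 3 samples, 2 features on very different scales.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 60.0]])

# Column-wise z-score written out by hand (np.std defaults to ddof=0).
z_manual = (X - X.mean(axis=0)) / X.std(axis=0)

# The same transformation via StandardScaler.
z_scaler = StandardScaler().fit_transform(X)

matches = np.allclose(z_manual, z_scaler)
print(matches)
```

After the transform each column has mean 0 and standard deviation 1, which is what "standardizing to z-scores" refers to here.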