Scaling in scikit-learn permutation_test_score

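I am using scikit-learn's permutation_test_score to assess the significance of my estimator's performance. Unfortunately, I cannot tell from the scikit-learn documentation whether the function applies any scaling to the data. I normally standardize my data with StandardScaler, fitting it on the training set and applying that same standardization to the test set.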

The function itself does not apply any scaling. Here is an example from the documentation:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import permutation_test_score
from sklearn import datasets


iris = datasets.load_iris()
X = iris.data
y = iris.target
n_classes = np.unique(y).size

# Some noisy features, uncorrelated with the target
random = np.random.RandomState(seed=0)
E = random.normal(size=(len(X), 2200))

# Append the noisy features to the informative ones to make the task harder
X = np.c_[X, E]

svm = SVC(kernel='linear')
cv = StratifiedKFold(2)

score, permutation_scores, pvalue = permutation_test_score(
    svm, X, y, scoring="accuracy", cv=cv, n_permutations=100, n_jobs=1)

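However, what you probably want to do is pass a pipeline to permutation_test_score and apply the scaling inside that pipeline. The scaler is then fitted on the training part of each cross-validation split and applied to the corresponding test part. Example: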

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Chain the scaler and the classifier so that scaling happens inside each CV split
pipe = Pipeline([('scaler', StandardScaler()), ('clf', SVC(kernel='linear'))])
score, permutation_scores, pvalue = permutation_test_score(
    pipe, X, y, scoring="accuracy", cv=cv, n_permutations=100, n_jobs=1)
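
To check that the pipeline really does what the question describes (fit the scaler on the training fold, apply it to the test fold), here is a minimal sketch that compares the pipeline against a hand-written loop. It is only an illustration: it uses the plain iris data without the extra noise features, only 30 permutations to keep it fast, and the names manual_scores and pipe_scores are mine, not part of scikit-learn.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import (StratifiedKFold, cross_val_score,
                                     permutation_test_score)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
cv = StratifiedKFold(2)

# Manual version: fit the scaler on the training fold only,
# then apply the same transformation to the test fold.
manual_scores = []
for train_idx, test_idx in cv.split(X, y):
    scaler = StandardScaler().fit(X[train_idx])
    clf = SVC(kernel="linear").fit(scaler.transform(X[train_idx]), y[train_idx])
    manual_scores.append(clf.score(scaler.transform(X[test_idx]), y[test_idx]))

# Pipeline version: the same steps, handled by scikit-learn inside each split.
pipe = Pipeline([("scaler", StandardScaler()), ("clf", SVC(kernel="linear"))])
pipe_scores = cross_val_score(pipe, X, y, scoring="accuracy", cv=cv)

# The same pipeline passed to permutation_test_score.
n_permutations = 30  # assumption: kept small so the sketch runs quickly
score, permutation_scores, pvalue = permutation_test_score(
    pipe, X, y, scoring="accuracy", cv=cv,
    n_permutations=n_permutations, random_state=0)

print("manual per-fold accuracy:  ", manual_scores)
print("pipeline per-fold accuracy:", pipe_scores)
print("score:", score, "p-value:", pvalue)
```

The per-fold accuracies of the two versions should match, and the score returned by permutation_test_score should be the mean of those fold accuracies. The p-value is computed from the permuted scores as (C + 1) / (n_permutations + 1), where C is the number of permutations whose score is at least as high as the score on the original labels.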

Please take a look at my answer and let me know.

Thank you very much. This is the solution I was looking for.