scikit-learn: GridSearchCV from the spark_sklearn package fails when I call the score method
I am trying to use GridSearchCV from the spark_sklearn package instead of the one in sklearn, so that the search can take advantage of Spark. However, when I call the estimator's score method, it fails.

The code is adapted from an existing example and looks like this:
def example_ppl():
    import numpy as np
    from pyspark.sql import SparkSession
    from sklearn import linear_model, decomposition, datasets
    from sklearn.pipeline import Pipeline
    # from sklearn.model_selection import GridSearchCV
    from spark_sklearn import GridSearchCV

    # Pipeline: PCA for dimensionality reduction, then logistic regression
    logistic = linear_model.LogisticRegression()
    pca = decomposition.PCA()
    pipe = Pipeline(steps=[('pca', pca), ('logistic', logistic)])

    digits = datasets.load_digits()
    X_digits = digits.data
    y_digits = digits.target

    # Parameter grid to search over
    n_components = [20, 40, 64]
    Cs = np.logspace(-4, 4, 3)

    # Create spark context
    spark_session = SparkSession.builder.appName('test').getOrCreate()
    sc = spark_session.sparkContext

    # spark_sklearn's GridSearchCV takes the SparkContext as its first argument
    estimator = GridSearchCV(sc,
                             estimator=pipe,
                             param_grid=dict(pca__n_components=n_components,
                                             logistic__C=Cs))
    print(type(estimator))
    estimator.fit(X_digits, y_digits)
    # print(estimator.cv_results_)
    estimator.score(X_digits, y_digits)  # this call raises the AttributeError
It throws the following error:
  File "D:/Python_Project/test/sklearn_pyspark.py", line 72, in example_ppl
    estimator.score(X_digits,y_digits)
  File "D:\PyEnvs\test\lib\site-packages\sklearn\model_selection\_search.py", line 436, in score
    score = self.scorer_[self.refit] if self.multimetric_ else self.scorer_
AttributeError: 'GridSearchCV' object has no attribute 'multimetric_'
Is this a problem with spark_sklearn, or am I missing something in my code?
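Reading the traceback, the failing line is inside sklearn's own `score`, which reads `self.multimetric_` on the object. One plausible explanation, which I have not confirmed, is that the spark_sklearn class inherits that `score` method but its own `fit` never sets `multimetric_`. The sketch below reproduces that kind of failure in plain Python. The classes `BaseSearch` and `OverridingSearch` are hypothetical stand-ins made up for illustration, not the real sklearn or spark_sklearn code.

```python
class BaseSearch:
    """Hypothetical stand-in for a search base class; not sklearn's real code."""

    def fit(self, X, y):
        # The base fit records fitted state, including the flag score() reads.
        self.multimetric_ = False
        self.scorer_ = lambda X, y: 1.0  # dummy scorer, for illustration only
        return self

    def score(self, X, y):
        # Same shape as the line in the traceback: multimetric_ is read first.
        scorer = self.scorer_["main"] if self.multimetric_ else self.scorer_
        return scorer(X, y)


class OverridingSearch(BaseSearch):
    """Hypothetical subclass whose fit() does not know about multimetric_."""

    def fit(self, X, y):
        # Sets only some fitted attributes and never calls super().fit().
        self.scorer_ = lambda X, y: 1.0
        self.best_estimator_ = "fitted-model-placeholder"
        return self


search = OverridingSearch().fit([[0.0]], [0])
try:
    search.score([[0.0]], [0])
    error_message = None
except AttributeError as exc:
    # The inherited score() fails because the attribute was never set.
    error_message = str(exc)

print(error_message)
```

If that reading is right, the cause would be a mismatch between the installed sklearn and spark_sklearn versions rather than anything in the code above. Scoring through `estimator.best_estimator_` (assuming the fit left a refitted model there) might then sidestep the inherited method.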