How to plot a heatmap after grid search to find the best hyperparameters for a decision tree in Python

After running a grid search on the DonorsChoose dataset (available on Kaggle), I need to plot a heatmap to find the best hyperparameters for a decision tree. I have two hyperparameters:
max_depth=[1, 5, 10, 50, 100, 500]
min_samples_split=[5, 10, 100, 500]
X_tr_bow = hstack((X_train_price_norm,X_train_categories_ohe,X_train_state_ohe,X_train_teacher_ohe,X_train_grade_ohe,X_train_essay__bow,X_train_clean_title__bow)).tocsr()
X_tr_bow
is the data I fit in the grid search.

Shapes of X_tr_bow and the label vector: (53531, 7980), (53531,)
The error I am facing:
Best cross-validation score: 0.59
Best parameters: {'max_depth': 50, 'min_samples_split': 500}
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
1650 blocks = [make_block(values=blocks[0],
-> 1651 placement=slice(0, len(axes[0])))]
1652
ValueError: Wrong number of items passed 1, placement implies 24
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in construction_error(tot_items, block_shape, axes, e)
1689 raise ValueError("Empty data passed with indices specified.")
1690 raise ValueError("Shape of passed values is {0}, indices imply {1}".format(
-> 1691 passed, implied))
1692
1693
ValueError: Shape of passed values is (24, 1), indices imply (24, 24)
In case anyone is still looking for an answer, the code below worked for me:
import pandas as pd
import seaborn as sn

# Collect all cross-validation results into a DataFrame
results = pd.DataFrame.from_dict(rand_search_cv.cv_results_)

# For each (min_samples_split, max_depth) pair keep the best score,
# then unstack max_depth into columns to get a 2-D grid of scores
max_scores = results.groupby(['param_min_samples_split', 'param_max_depth']).max()
max_scores = max_scores.unstack()[['mean_test_score', 'mean_train_score']]

# Heatmap of mean test scores over the hyperparameter grid
sn.heatmap(max_scores.mean_test_score, annot=True, fmt='.4g');
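The key step is reshaping the flat `cv_results_` table into a 2-D grid whose rows and columns are the two hyperparameters; the original `ValueError` ("Shape of passed values is (24, 1), indices imply (24, 24)") comes from handing the 24 flat rows (6 depths × 4 split values) to a DataFrame as if they were already a 24×24 grid. Below is a minimal sketch of the reshaping using a mocked-up `cv_results_` dict with made-up scores, so it runs without fitting a model; a real `GridSearchCV`/`RandomizedSearchCV` fills these arrays for you:

```python
import pandas as pd

# Hypothetical stand-in for search.cv_results_: one entry per parameter
# combination (6 max_depth values x 4 min_samples_split values = 24 rows).
# The scores here are fabricated for illustration only.
max_depth = [1, 5, 10, 50, 100, 500]
min_samples_split = [5, 10, 100, 500]
cv_results = {
    'param_max_depth': [d for d in max_depth for _ in min_samples_split],
    'param_min_samples_split': min_samples_split * len(max_depth),
    'mean_test_score': [0.5 + 0.001 * i for i in range(24)],
}

results = pd.DataFrame(cv_results)

# pivot_table turns the 24 flat rows into a 4 x 6 grid: one row per
# min_samples_split, one column per max_depth -- exactly the shape
# sn.heatmap expects.
scores = results.pivot_table(index='param_min_samples_split',
                             columns='param_max_depth',
                             values='mean_test_score')
print(scores.shape)

# To draw the heatmap (requires seaborn):
# import seaborn as sn
# sn.heatmap(scores, annot=True, fmt='.4g')
```

Passing the pivoted frame (rather than the raw `cv_results_` columns) to `sn.heatmap` avoids the shape mismatch entirely.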