How do I extract the best parameter set from a TrainValidationSplitModel in Java?
I am using ParamGridBuilder to build a parameter grid for a TrainValidationSplit search, in order to determine the best model (a RandomForestClassifier) in Java. Now I want to know which of the parameters in the ParamGridBuilder (maxDepth, numTrees) produced the best model.
// Keep a reference to the classifier so the grid can refer to its params
RandomForestClassifier rf = new RandomForestClassifier()
    .setLabelCol("label")
    .setFeaturesCol("features");
Pipeline pipeline = new Pipeline().setStages(new PipelineStage[]{
    new VectorAssembler()
        .setInputCols(new String[]{"a", "b"}).setOutputCol("features"),
    rf});
ParamMap[] paramGrid = new ParamGridBuilder()
    .addGrid(rf.maxDepth(), new int[]{10, 15})
    .addGrid(rf.numTrees(), new int[]{5, 10})
    .build();
BinaryClassificationEvaluator evaluator = new BinaryClassificationEvaluator().setLabelCol("label");
TrainValidationSplit trainValidationSplit = new TrainValidationSplit()
    .setEstimator(pipeline)
    .setEstimatorParamMaps(paramGrid)
    .setEvaluator(evaluator)
    .setTrainRatio(0.85);
TrainValidationSplitModel model = trainValidationSplit.fit(dataLog);
System.out.println("paramMap size: " + model.bestModel().paramMap().size());
System.out.println("defaultParamMap size: " + model.bestModel().defaultParamMap().size());
System.out.println("extractParamMap: " + model.bestModel().extractParamMap());
System.out.println("explainParams: " + model.bestModel().explainParams());
System.out.println("numTrees: " + model.bestModel().getParam("numTrees")); // throws NoSuchElementException: Param numTrees does not exist.
None of these attempts helped. The output was:
paramMap size: 0
defaultParamMap size: 0
extractParamMap: {
}
explainParams:
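I found a way: read the parameters from the parent Pipeline of the best model, whose second stage is the RandomForestClassifier: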
Pipeline bestModelPipeline = (Pipeline) model.bestModel().parent();
RandomForestClassifier bestRf = (RandomForestClassifier) bestModelPipeline.getStages()[1];
System.out.println("maxDepth : " + bestRf.getMaxDepth());
System.out.println("numTrees : " + bestRf.getNumTrees());
System.out.println("maxBins : " + bestRf.getMaxBins());
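Another way to cross-check the result is to pair each validation metric with the parameter combination that produced it. As far as I know, `model.validationMetrics()` returns one score per entry of `model.getEstimatorParamMaps()`, in the same order, and `evaluator.isLargerBetter()` tells you whether to look for the maximum or the minimum. The sketch below shows only the index-selection step in plain Java, so it runs without Spark. The class name `BestParamIndex`, the `bestIndex` helper, the grid labels, and the metric values are all hypothetical stand-ins, not real results.

```java
class BestParamIndex {

    /**
     * Returns the index of the best metric.
     * Ties are resolved in favour of the first occurrence.
     */
    static int bestIndex(double[] metrics, boolean largerIsBetter) {
        if (metrics.length == 0) {
            throw new IllegalArgumentException("metrics must not be empty");
        }
        int best = 0;
        for (int i = 1; i < metrics.length; i++) {
            boolean better = largerIsBetter
                    ? metrics[i] > metrics[best]
                    : metrics[i] < metrics[best];
            if (better) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for model.getEstimatorParamMaps():
        // one label per combination in the grid (order is illustrative).
        String[] grid = {
            "maxDepth=10, numTrees=5",
            "maxDepth=10, numTrees=10",
            "maxDepth=15, numTrees=5",
            "maxDepth=15, numTrees=10"
        };
        // Hypothetical stand-in for model.validationMetrics();
        // these numbers are made up for illustration only.
        double[] metrics = {0.81, 0.86, 0.84, 0.79};

        // With Spark this flag would come from evaluator.isLargerBetter().
        int i = bestIndex(metrics, true);
        System.out.println("best combination: " + grid[i] + " (metric " + metrics[i] + ")");
    }
}
```

With Spark on the classpath you would pass `model.validationMetrics()` and `evaluator.isLargerBetter()` to the helper and then print `model.getEstimatorParamMaps()[i]`, which lists the winning maxDepth and numTrees values directly.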