Machine learning: parameter optimization & attribute selection using Weka

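I want to set the -C parameter in J48 and run three feature selection algorithms that are stored in a hash table. I want to compare the performance of the three feature selection algorithms on precision, true positives, true negatives, F1 and so on. But when I run all of the feature selection algorithms, they return the same output. Am I doing something wrong?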

    Hashtable<String, ASEvaluation> search = new Hashtable<String, ASEvaluation>();

    Instances training_data = new Instances(new BufferedReader(
            new FileReader("test.arff")));
    training_data.setClassIndex(training_data.numAttributes() - 1);
    topAttributes = new int[training_data.numAttributes()];

    AttributeSelectedClassifier classifier = new AttributeSelectedClassifier();
    AttributeSelection attsel = new AttributeSelection();

    search.put("Infogain", new InfoGainAttributeEval());
    search.put("SymmetricalUncertAttribute", new SymmetricalUncertAttributeEval());
    search.put("Chisquared", new ChiSquaredAttributeEval());

    for (String key : search.keySet()) {
        try {
            Ranker attribute_search = new Ranker();
            J48 base = new J48();
            CVParameterSelection ps = new CVParameterSelection();
            ps.setClassifier(base);
            ps.setNumFolds(5);
            ps.addCVParameter("C 0.1 0.5 5");
            ps.buildClassifier(training_data);

            System.out.println("---------------- " + search.get(key).toString() + " ----------------");

            classifier.setClassifier(ps);
            classifier.setEvaluator(search.get(key));
            classifier.setSearch(attribute_search);

            attsel.setEvaluator(search.get(key));
            attsel.setSearch(attribute_search);
            attsel.setInputFormat(training_data);

            Evaluation evaluation = new Evaluation(training_data);
            evaluation.crossValidateModel(ps, training_data, 10, new Random(1));
            System.out.println("\nevaluation ->");
            System.out.println(evaluation.toSummaryString());
            System.out.println("MAE: " + evaluation.meanAbsoluteError());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

I'm not sure how you are going about this. But have you tried using grid search?

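Have you checked whether the feature selection algorithms actually have an effect on the dataset? They may really all be returning the same feature set. Even if they don't, J48 may simply pick the same subset of attributes to use in the tree it builds.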
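
One way to run the check suggested above is to compare the top-k attributes that each evaluator ranks highest. The following is only a minimal sketch in plain Java, not Weka code: the class name FeatureSetOverlap, the helper methods, and the rankings in main are all hypothetical placeholders. In practice the index arrays would come from whatever ranking each evaluator produces on your data.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of Weka): measures how much the top-k
// attributes of two rankings overlap.
class FeatureSetOverlap {

    /** Returns the first k entries of a ranking as a set (order is discarded). */
    static Set<Integer> topK(int[] ranking, int k) {
        Set<Integer> top = new HashSet<>();
        for (int i = 0; i < Math.min(k, ranking.length); i++) {
            top.add(ranking[i]);
        }
        return top;
    }

    /**
     * Jaccard overlap of the two top-k sets: size of the intersection divided
     * by size of the union. A value of 1.0 means both rankings select exactly
     * the same k attributes.
     */
    static double jaccard(int[] rankingA, int[] rankingB, int k) {
        Set<Integer> a = topK(rankingA, k);
        Set<Integer> b = topK(rankingB, k);
        Set<Integer> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 1.0; // two empty selections are trivially identical
        }
        Set<Integer> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Made-up rankings (attribute indices, best first), for illustration only.
        int[] infoGain = {4, 2, 7, 0, 1};
        int[] symUncert = {2, 4, 7, 1, 0};
        int[] chiSquared = {4, 7, 2, 0, 3};
        int k = 3;

        System.out.println("InfoGain vs SymmetricalUncert: " + jaccard(infoGain, symUncert, k));
        System.out.println("InfoGain vs ChiSquared:        " + jaccard(infoGain, chiSquared, k));
    }
}
```

If the overlap is 1.0 for every pair at the k you keep, identical downstream results are to be expected. Separately, in the posted loop the object passed to crossValidateModel is ps, which wraps only J48, while the evaluator and search are set on classifier, which is never evaluated. If that is the code actually being run, every iteration would cross-validate the same model on the same data with the same seed, which would also explain identical output. Passing classifier instead would bring the attribute selection into the cross-validation.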