Different classification results in Weka: GUI vs. Java library


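I ran into a problem when comparing the classification results of the Weka GUI with those of my Java program, using a J48 tree on the iris dataset. I would be very grateful for any help.

I am working with the iris dataset and trying to develop a Java program that classifies new instances. To do that, I used the Weka GUI to obtain a model, iris_treeCV.model, trained and evaluated with 10-fold cross-validation. The Weka GUI results are good and as expected: 4 misclassified instances. I then saved the model for later use in my Java program.

When I load iris_treeCV.model in my Java program and try to classify the new instances of a test dataset, the results are different: the Java program classifies "setosa" and "virginica" correctly but never predicts "versicolour". The results are as follows: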

Classification: setosa
Classification: setosa
Classification: virginica
Classification: virginica
Classification: virginica
Classification: virginica

Whereas I expected to get:

Classification: setosa
Classification: setosa
Classification: versicolour
Classification: versicolour
Classification: virginica
Classification: virginica
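
I have read some related posts, but I could not find a clear explanation for this odd behaviour when using Java instead of the Weka GUI.

I attach the Java code, in two classes, followed by the training and test sets. Thanks in advance.

Main class: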

public static void main(String[] args) {

    try {


        Hashtable<String, String> values = new Hashtable<String, String>();

        //Loading the model
        String pathModel="";
        String pathTestSet="";
        JFileChooser chooserModel = new JFileChooser();
        chooserModel.setCurrentDirectory(new java.io.File("."));
        chooserModel.setDialogTitle("HoliDes: choose the model");
        chooserModel.setFileSelectionMode(JFileChooser.FILES_AND_DIRECTORIES);
        chooserModel.setAcceptAllFileFilterUsed(true);

        if (chooserModel.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
            File filePathModel=chooserModel.getSelectedFile();
            pathModel=filePathModel.getPath();

            State irisModel = new State(pathModel);

            //Choosing the test set file
            JFileChooser chooserTestSet = new JFileChooser();
            chooserTestSet.setDialogTitle("HoliDes: choose TEST SET");
            chooserTestSet.setFileSelectionMode(JFileChooser.FILES_AND_DIRECTORIES);
            chooserTestSet.setAcceptAllFileFilterUsed(true);

            //Loading the testing dataset
            if (chooserTestSet.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
                File filePathTestSet=chooserTestSet.getSelectedFile();
                pathTestSet=filePathTestSet.getPath();

                //Transforming the data set into pairs attribute-value
                ConverterUtils.DataSource unlabeledSource = new ConverterUtils.DataSource(pathTestSet);
                Instances unlabeledData = unlabeledSource.getDataSet();
                if (unlabeledData.classIndex() == -1){
                    unlabeledData.setClassIndex(unlabeledData.numAttributes() - 1);
                }

                for (int i = 0; i < unlabeledData.numInstances(); i++) {
                    Instance ins=unlabeledData.instance(i);

                    for (int j = 0; j < ins.numAttributes(); j++) {

                        String attrib=ins.attribute(j).name();
                        double val=ins.value(ins.attribute(j));

                        values.put(attrib,String.valueOf(val));

                    }

                    System.out.println("Classification: " + irisModel.classifySpecies(values,pathModel));

                }

            }

        }

    } catch (Exception ex) {
        Logger.getLogger(PilotPatternClassifier.class.getName()).log(Level.SEVERE, null, ex);
    }

}


Thank you very much for your help.
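
One thing that may be worth checking in the code above (an observation, not a diagnosis confirmed anywhere in this thread): main stores the attribute values in a Hashtable, and classifySpecies in the State class rebuilds the attribute list by enumerating that Hashtable's keys. A Hashtable makes no promise about key order, so the rebuilt instance may not list the attributes in the order of the training header. As far as I know, Weka classifiers such as J48 read attribute values by position rather than by name, so a reordered instance could plausibly shift predictions. The sketch below uses only the standard library; the class AttributeOrderDemo and its method are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AttributeOrderDemo {
    // Attribute order as declared in the ARFF headers of the question.
    static final String[] ARFF_ORDER = {"sepallength", "sepalwidth", "petallength", "petalwidth"};

    // Fills the given map in ARFF order, then returns the keys in the
    // order in which the map itself iterates over them.
    static List<String> iterationOrder(Map<String, String> map) {
        double[] row = {6.6, 3.0, 4.4, 1.4}; // third row of the test set
        for (int i = 0; i < ARFF_ORDER.length; i++) {
            map.put(ARFF_ORDER[i], String.valueOf(row[i]));
        }
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // Hashtable: order depends on hashing, not on insertion.
        System.out.println("Hashtable:     " + iterationOrder(new Hashtable<>()));
        // LinkedHashMap: iteration order is guaranteed to be insertion order.
        System.out.println("LinkedHashMap: " + iterationOrder(new LinkedHashMap<>()));
    }
}
```

If the two printed lines differ, then switching to an insertion-ordered map, or classifying the Instance objects loaded from the ARFF file directly, would be a reasonable thing to try.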

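I think it is normal for a classifier to be less accurate on a test set than when it is checked against the training set file. Try evaluating this test set in the Weka GUI; you may well get the same result. This is not a GUI-versus-Java problem.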


I would have posted this as a comment, but I lack the reputation to comment.

Could it be the way you are reading the model? Have you tried?
Yes, that is how I read it: classModel = (Classifier) weka.core.SerializationHelper.read(pathModel); where pathModel is D:\Users\106811\Desktop\iris_treeCV.model
Ignored Class Unknown Instances: 6. Note that I have modified my test file to include the class in the header and added "?" to the data rows, for example: 6.0,3.0,4.8,1.8,?

State class:

public class State {

    //private String classModelFile = "/iris_tree.model";    
    private Classifier classModel;
    private Instances dataModel;

    /**
     *  Class constructor.
     */
    public State(String pathModel) throws Exception {
            //InputStream classModelStream;
            //  Create a stream object for the model file embedded within the JAR file.
            //classModelStream = getClass().getResourceAsStream(classModelFile);
            classModel=(Classifier) weka.core.SerializationHelper.read(pathModel);
    }

    /**
     *  Close the instance by setting both the model file string and
     *  the model object itself to null.  When the garbage collector
     *  runs, this should make clean up simpler.  However, the garbage
     *  collector is not called synchronously since that should be
     *  managed by the larger execution environment.
     */
    public void close() {
            classModel = null;
            //classModelFile=null;
    }

    /**
     * Evaluate the model on the data provided by @param measures.
     * This returns a string with the species name.
     *
     * @param measures object with petal and sepal measurements
     * @return string with the species name
     * @throws Exception
     */
    public String classifySpecies(Dictionary<String, String> measures, String pathTestSet) throws Exception {
            FastVector dataClasses = new FastVector();
            FastVector dataAttribs = new FastVector();
            Attribute species;
            double values[] = new double[measures.size() + 1];
            int i = 0, maxIndex = 0;

            //  Assemble the potential species options.
            dataClasses.addElement("setosa");
            dataClasses.addElement("versicolour");
            dataClasses.addElement("virginica");
            species = new Attribute("species", dataClasses);

            //  Create the object to classify on.
            for (Enumeration<String> keys = measures.keys(); keys.hasMoreElements(); ) {

                    String key = keys.nextElement();
                    double val = Double.parseDouble(measures.get(key));         
                    dataAttribs.addElement(new Attribute(key));

                    values[i++] = val;

            }

            dataAttribs.addElement(species);
            dataModel = new Instances("iris-test", dataAttribs, 0);//"iris-test" is the relation name of this in-memory dataset. It is arbitrary
            dataModel.setClass(species);

            Instance ins=new DenseInstance(1, values);
            //dataModel.add(new Instance(1, values) {});            
            dataModel.add(ins);            
            dataModel.instance(0).setClassMissing();

            //  Find the class with the highest estimated likelihood
            double cl[] = classModel.distributionForInstance(dataModel.instance(0));
            for(i = 0; i < cl.length; i++){
                if(cl[i] > cl[maxIndex]){
                        maxIndex = i;
                }
            }
            return dataModel.classAttribute().value(maxIndex);


    }


}
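
The last step of classifySpecies picks the class whose estimated probability is highest. That step can be looked at in isolation, without Weka, as in the minimal sketch below. The class name ArgMaxDemo and the probability values are made up for illustration; they are not output from the model in the question.

```java
final class ArgMaxDemo {
    // Returns the index of the largest entry; on ties the first one wins,
    // which matches the strict ">" comparison used in classifySpecies.
    static int argMax(double[] distribution) {
        int maxIndex = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[maxIndex]) {
                maxIndex = i;
            }
        }
        return maxIndex;
    }

    public static void main(String[] args) {
        // Labels in the order declared by the species attribute.
        String[] labels = {"setosa", "versicolour", "virginica"};
        double[] distribution = {0.0, 0.9, 0.1}; // hypothetical values
        System.out.println("Classification: " + labels[argMax(distribution)]);
    }
}
```

The returned index only maps to the right label if the label list is in the same order as the class attribute the model was trained with, which is the case for the three values added in classifySpecies.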

Training set:

@RELATION iris-train

@ATTRIBUTE sepallength  REAL
@ATTRIBUTE sepalwidth   REAL
@ATTRIBUTE petallength  REAL
@ATTRIBUTE petalwidth   REAL
@ATTRIBUTE species  {setosa,versicolour,virginica}

@DATA
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5.0,3.6,1.4,0.2,setosa
5.4,3.9,1.7,0.4,setosa
4.6,3.4,1.4,0.3,setosa
5.0,3.4,1.5,0.2,setosa
4.4,2.9,1.4,0.2,setosa
4.9,3.1,1.5,0.1,setosa
5.4,3.7,1.5,0.2,setosa
4.8,3.4,1.6,0.2,setosa
4.8,3.0,1.4,0.1,setosa
4.3,3.0,1.1,0.1,setosa
5.8,4.0,1.2,0.2,setosa
5.7,4.4,1.5,0.4,setosa
5.4,3.9,1.3,0.4,setosa
5.1,3.5,1.4,0.3,setosa
5.7,3.8,1.7,0.3,setosa
5.1,3.8,1.5,0.3,setosa
5.4,3.4,1.7,0.2,setosa
5.1,3.7,1.5,0.4,setosa
4.6,3.6,1.0,0.2,setosa
5.1,3.3,1.7,0.5,setosa
4.8,3.4,1.9,0.2,setosa
5.0,3.0,1.6,0.2,setosa
5.0,3.4,1.6,0.4,setosa
5.2,3.5,1.5,0.2,setosa
5.2,3.4,1.4,0.2,setosa
4.7,3.2,1.6,0.2,setosa
4.8,3.1,1.6,0.2,setosa
5.4,3.4,1.5,0.4,setosa
5.2,4.1,1.5,0.1,setosa
5.5,4.2,1.4,0.2,setosa
4.9,3.1,1.5,0.1,setosa
5.0,3.2,1.2,0.2,setosa
5.5,3.5,1.3,0.2,setosa
4.9,3.1,1.5,0.1,setosa
4.4,3.0,1.3,0.2,setosa
5.1,3.4,1.5,0.2,setosa
5.0,3.5,1.3,0.3,setosa
4.5,2.3,1.3,0.3,setosa
4.4,3.2,1.3,0.2,setosa
5.0,3.5,1.6,0.6,setosa
5.1,3.8,1.9,0.4,setosa
4.8,3.0,1.4,0.3,setosa
5.1,3.8,1.6,0.2,setosa
4.6,3.2,1.4,0.2,setosa
5.3,3.7,1.5,0.2,setosa
5.0,3.3,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolour
6.4,3.2,4.5,1.5,versicolour
6.9,3.1,4.9,1.5,versicolour
5.5,2.3,4.0,1.3,versicolour
6.5,2.8,4.6,1.5,versicolour
5.7,2.8,4.5,1.3,versicolour
6.3,3.3,4.7,1.6,versicolour
4.9,2.4,3.3,1.0,versicolour
6.6,2.9,4.6,1.3,versicolour
5.2,2.7,3.9,1.4,versicolour
5.0,2.0,3.5,1.0,versicolour
5.9,3.0,4.2,1.5,versicolour
6.0,2.2,4.0,1.0,versicolour
6.1,2.9,4.7,1.4,versicolour
5.6,2.9,3.6,1.3,versicolour
6.7,3.1,4.4,1.4,versicolour
5.6,3.0,4.5,1.5,versicolour
5.8,2.7,4.1,1.0,versicolour
6.2,2.2,4.5,1.5,versicolour
5.6,2.5,3.9,1.1,versicolour
5.9,3.2,4.8,1.8,versicolour
6.1,2.8,4.0,1.3,versicolour
6.3,2.5,4.9,1.5,versicolour
6.1,2.8,4.7,1.2,versicolour
6.4,2.9,4.3,1.3,versicolour
6.6,3.0,4.4,1.4,versicolour
6.8,2.8,4.8,1.4,versicolour
6.7,3.0,5.0,1.7,versicolour
6.0,2.9,4.5,1.5,versicolour
5.7,2.6,3.5,1.0,versicolour
5.5,2.4,3.8,1.1,versicolour
5.5,2.4,3.7,1.0,versicolour
5.8,2.7,3.9,1.2,versicolour
6.0,2.7,5.1,1.6,versicolour
5.4,3.0,4.5,1.5,versicolour
6.0,3.4,4.5,1.6,versicolour
6.7,3.1,4.7,1.5,versicolour
6.3,2.3,4.4,1.3,versicolour
5.6,3.0,4.1,1.3,versicolour
5.5,2.5,4.0,1.3,versicolour
5.5,2.6,4.4,1.2,versicolour
6.1,3.0,4.6,1.4,versicolour
5.8,2.6,4.0,1.2,versicolour
5.0,2.3,3.3,1.0,versicolour
5.6,2.7,4.2,1.3,versicolour
5.7,3.0,4.2,1.2,versicolour
5.7,2.9,4.2,1.3,versicolour
6.2,2.9,4.3,1.3,versicolour
5.1,2.5,3.0,1.1,versicolour
5.7,2.8,4.1,1.3,versicolour
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
7.1,3.0,5.9,2.1,virginica
6.3,2.9,5.6,1.8,virginica
6.5,3.0,5.8,2.2,virginica
7.6,3.0,6.6,2.1,virginica
4.9,2.5,4.5,1.7,virginica
7.3,2.9,6.3,1.8,virginica
6.7,2.5,5.8,1.8,virginica
7.2,3.6,6.1,2.5,virginica
6.5,3.2,5.1,2.0,virginica
6.4,2.7,5.3,1.9,virginica
6.8,3.0,5.5,2.1,virginica
5.7,2.5,5.0,2.0,virginica
5.8,2.8,5.1,2.4,virginica
6.4,3.2,5.3,2.3,virginica
6.5,3.0,5.5,1.8,virginica
7.7,3.8,6.7,2.2,virginica
7.7,2.6,6.9,2.3,virginica
6.0,2.2,5.0,1.5,virginica
6.9,3.2,5.7,2.3,virginica
5.6,2.8,4.9,2.0,virginica
7.7,2.8,6.7,2.0,virginica
6.3,2.7,4.9,1.8,virginica
6.7,3.3,5.7,2.1,virginica
7.2,3.2,6.0,1.8,virginica
6.2,2.8,4.8,1.8,virginica
6.1,3.0,4.9,1.8,virginica
6.4,2.8,5.6,2.1,virginica
7.2,3.0,5.8,1.6,virginica
7.4,2.8,6.1,1.9,virginica
7.9,3.8,6.4,2.0,virginica
6.4,2.8,5.6,2.2,virginica
6.3,2.8,5.1,1.5,virginica
6.1,2.6,5.6,1.4,virginica
7.7,3.0,6.1,2.3,virginica
6.3,3.4,5.6,2.4,virginica
6.4,3.1,5.5,1.8,virginica
6.0,3.0,4.8,1.8,virginica
6.9,3.1,5.4,2.1,virginica
6.7,3.1,5.6,2.4,virginica
6.9,3.1,5.1,2.3,virginica
5.8,2.7,5.1,1.9,virginica
6.8,3.2,5.9,2.3,virginica
6.7,3.3,5.7,2.5,virginica
6.7,3.0,5.2,2.3,virginica
6.3,2.5,5.0,1.9,virginica
6.5,3.0,5.2,2.0,virginica
6.2,3.4,5.4,2.3,virginica
5.9,3.0,5.1,1.8,virginica

Test set:

@RELATION iris-test

@ATTRIBUTE sepallength  REAL
@ATTRIBUTE sepalwidth   REAL
@ATTRIBUTE petallength  REAL
@ATTRIBUTE petalwidth   REAL

@DATA
5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2
6.6,3.0,4.4,1.4
6.8,2.8,4.8,1.4
6.4,3.1,5.5,1.8
6.0,3.0,4.8,1.8
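
The comments mention changing the test file so that the header declares the class attribute and each data row ends with "?", which is the ARFF marker for a missing value. A small sketch of what such a file could look like, generated with plain Java, is given below. The class UnlabeledArffDemo and the method buildTestArff are hypothetical helpers written for this illustration; the attribute names and class values are copied from the training header above.

```java
import java.util.Locale;

final class UnlabeledArffDemo {
    // Builds ARFF text whose header matches the training set, with the
    // class value of every row left missing ("?").
    static String buildTestArff(double[][] rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@RELATION iris-test\n\n");
        sb.append("@ATTRIBUTE sepallength  REAL\n");
        sb.append("@ATTRIBUTE sepalwidth   REAL\n");
        sb.append("@ATTRIBUTE petallength  REAL\n");
        sb.append("@ATTRIBUTE petalwidth   REAL\n");
        sb.append("@ATTRIBUTE species  {setosa,versicolour,virginica}\n\n");
        sb.append("@DATA\n");
        for (double[] row : rows) {
            for (double value : row) {
                // Locale.ROOT keeps the decimal point regardless of system locale.
                sb.append(String.format(Locale.ROOT, "%.1f", value)).append(',');
            }
            sb.append("?\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[][] rows = {
            {5.1, 3.5, 1.4, 0.2}, {4.9, 3.0, 1.4, 0.2},
            {6.6, 3.0, 4.4, 1.4}, {6.8, 2.8, 4.8, 1.4},
            {6.4, 3.1, 5.5, 1.8}, {6.0, 3.0, 4.8, 1.8}
        };
        System.out.print(buildTestArff(rows));
    }
}
```

With a header like this, the test file has the same five attributes in the same order as the training file, so the class index no longer falls on petalwidth when the last attribute is taken as the class.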