Java OpenNLP classifier output


I am currently using the following code to train a classifier model:

    final String iterations = "1000";
    final String cutoff = "0";
    InputStreamFactory dataIn = new MarkableFileInputStreamFactory(new File("src/main/resources/trainingSets/classifierA.txt"));
    ObjectStream<String> lineStream = new PlainTextByLineStream(dataIn, "UTF-8");
    ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);

    TrainingParameters params = new TrainingParameters();
    params.put(TrainingParameters.ITERATIONS_PARAM, iterations);
    params.put(TrainingParameters.CUTOFF_PARAM, cutoff);
    params.put(AbstractTrainer.ALGORITHM_PARAM, NaiveBayesTrainer.NAIVE_BAYES_VALUE);

    DoccatModel model = DocumentCategorizerME.train("NL", sampleStream, params, new DoccatFactory());

    // try-with-resources closes (and flushes) the buffered stream once the model is written
    try (OutputStream modelOut = new BufferedOutputStream(new FileOutputStream("src/main/resources/models/model.bin"))) {
        model.serialize(modelOut);
    }

    return model;
Can someone explain what the output printed during training means? Does it tell us anything about accuracy?

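Looking at the OpenNLP source, we can see that this output is produced by the trainer's trainModel(DataIndexer di) method, which prints the event, outcome, and predicate counts and then calls findParameters().

If you look at the findParameters() code, you will notice that it calls the trainingStats() method, which contains the snippet that computes the accuracy: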

private double trainingStats(EvalParameters evalParams) {
    // ...
    double trainingAccuracy = (double) numCorrect / numEvents;
    display("Stats: (" + numCorrect + "/" + numEvents + ") " + trainingAccuracy + "\n");
    return trainingAccuracy;
}
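
To make the counting concrete, here is a minimal self-contained sketch of the idea behind numCorrect. It assumes that an event counts as correct when the outcome with the highest model score equals the event's labeled outcome. The class TrainingAccuracySketch, its methods, and the toy scores are hypothetical illustrations, not OpenNLP's implementation.

```java
final class TrainingAccuracySketch {

    /** Index of the largest score, i.e. the predicted outcome (hypothetical helper). */
    static int argMax(double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Counts the events whose predicted outcome equals the labeled outcome. */
    static int countCorrect(double[][] scoresPerEvent, int[] labels) {
        int numCorrect = 0;
        for (int e = 0; e < labels.length; e++) {
            if (argMax(scoresPerEvent[e]) == labels[e]) {
                numCorrect++;
            }
        }
        return numCorrect;
    }

    public static void main(String[] args) {
        // Toy data: one row of per-outcome scores for each training event.
        double[][] scores = { {0.9, 0.1}, {0.6, 0.4}, {0.7, 0.3}, {0.2, 0.8} };
        int[] labels = {0, 1, 0, 1};

        int numCorrect = countCorrect(scores, labels);
        int numEvents = labels.length;
        double trainingAccuracy = (double) numCorrect / numEvents;

        // Prints: Stats: (3/4) 0.75
        System.out.println("Stats: (" + numCorrect + "/" + numEvents + ") " + trainingAccuracy);
    }
}
```

Note that this figure is measured on the same events the model was trained on, so it describes the fit to the training data rather than performance on unseen documents.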

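TL;DR: the "Stats: (998/1474) 0.6770691994572592" part of the output is the accuracy you are looking for. It is the accuracy on the training set: 998 of the 1474 training events were classified correctly.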
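
If you capture the training output and want the figure programmatically, a small parser along these lines may help. It is a sketch that assumes the line has exactly the shape produced by the display(...) call shown above; StatsLineParser and its method are hypothetical names, not part of OpenNLP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StatsLineParser {

    // Matches lines shaped like: Stats: (998/1474) 0.6770691994572592
    private static final Pattern STATS =
            Pattern.compile("Stats:\\s*\\((\\d+)/(\\d+)\\)\\s*([0-9.eE+-]+)");

    /** Returns {numCorrect, numEvents, accuracy}, or null if the line does not match. */
    static double[] parse(String line) {
        Matcher m = STATS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Integer.parseInt(m.group(1)),
            Integer.parseInt(m.group(2)),
            Double.parseDouble(m.group(3))
        };
    }

    public static void main(String[] args) {
        double[] stats = parse("Stats: (998/1474) 0.6770691994572592");
        System.out.println("correct=" + (int) stats[0]
                + " events=" + (int) stats[1]
                + " accuracy=" + stats[2]);
    }
}
```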

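Thanks for the good answer; I have one more question. What is numCorrect based on? In this training set, 998 entries fall under category 2 and the rest under category 4. Why is numCorrect equal to the number of entries in category 2?

@Patrick numCorrect is also computed in trainingStats(). Take a look.

@Patrick If you found this answer useful, please don't forget to "accept" it by clicking the check mark (the "Nike logo") below the up/down arrows. :-)

Yes, of course; I was just testing whether I understand it now. It returns the probability of docWords belonging to each category.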
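
One plausible reading of the 998 figure, offered as an assumption rather than something confirmed in this thread: if the trained model assigns category 2 to every training event, then every event labeled 2 counts as correct and every event labeled 4 counts as wrong, so numCorrect equals the size of the majority class. The sketch below checks that arithmetic with the counts from the question. The class and method names are hypothetical, and the 476 is simply 1474 - 998.

```java
import java.util.Arrays;

final class MajorityClassCheck {

    /** Number of correct predictions when every event is assigned predictedLabel. */
    static int correctWhenAlwaysPredicting(String predictedLabel, String[] labels) {
        int numCorrect = 0;
        for (String label : labels) {
            if (label.equals(predictedLabel)) {
                numCorrect++;
            }
        }
        return numCorrect;
    }

    /** Builds countOf2 labels "2" followed by countOf4 labels "4". */
    static String[] buildLabels(int countOf2, int countOf4) {
        String[] labels = new String[countOf2 + countOf4];
        Arrays.fill(labels, 0, countOf2, "2");
        Arrays.fill(labels, countOf2, labels.length, "4");
        return labels;
    }

    public static void main(String[] args) {
        // Counts taken from the question: 998 events labeled 2, the remaining 476 labeled 4.
        String[] labels = buildLabels(998, 1474 - 998);

        int numCorrect = correctWhenAlwaysPredicting("2", labels);
        double trainingAccuracy = (double) numCorrect / labels.length;

        System.out.println("Stats: (" + numCorrect + "/" + labels.length + ") " + trainingAccuracy);
    }
}
```

If that is what is happening here, the 0.677 is just the majority-class baseline and says little about how well the model separates the two categories; evaluating on held-out documents would be more informative.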

For reference, the trainModel method mentioned in the answer:

public AbstractModel trainModel(DataIndexer di) {
    // ...
    display("done.\n");
    display("\tNumber of Event Tokens: " + numUniqueEvents + "\n");
    display("\t    Number of Outcomes: " + numOutcomes + "\n");
    display("\t  Number of Predicates: " + numPreds + "\n");
    display("Computing model parameters...\n");
    MutableContext[] finalParameters = findParameters();
    display("...done.\n");
    // ...
}