Java MALLET: getting the most influential features from a document classifier

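Following a MALLET example, I built a classifier for document classification.

The next thing I want to do is get the most influential features for each class. I am sure this is simple, but I have not found out how to do it from Java.

Thanks for your help.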

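I ran into the same problem. Here is what worked for me. (It is not fully self-contained; for example, it assumes you already have a classifier and some test data.)

Edit: the value lookup should be plig.getInfoGain(q).getValueAtRank(j), not plig.getInfoGain(q).getValueAtRank(i).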

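import java.io.File;
import java.io.FileReader;
import java.io.PrintWriter;
import cc.mallet.pipe.iterator.CsvIterator;
import cc.mallet.types.Alphabet;
import cc.mallet.types.InstanceList;
import cc.mallet.types.LabelAlphabet;
import cc.mallet.types.PerLabelInfoGain;

// Assumes `classifier` is an already trained cc.mallet.classify.Classifier.
PrintWriter debugOut = new PrintWriter(new File(<filePath>));
InstanceList testInstances = new InstanceList(classifier.getInstancePipe());
// Each input line is "name label data"; 3, 2, 1 are the regex groups for (data, label, name).
CsvIterator reader = new CsvIterator(new FileReader(<path_to_testdata>), "(\\w+)\\s+(\\w+)\\s+(.*)", 3, 2, 1);
testInstances.addThruPipe(reader);
// Information gain of every feature, computed separately for each label.
PerLabelInfoGain plig = new PerLabelInfoGain(testInstances);
Alphabet alpha = classifier.getAlphabet();
LabelAlphabet la = classifier.getLabelAlphabet();
debugOut.println("debugging label numbers: " + la.size());
for (int q = 0; q < la.size(); q++) {
    debugOut.println("Class: " + la.lookupLabel(q));
    // Print the 10 highest-ranked features for this class.
    for (int j = 0; j < 10; j++) {
        int alphaId = plig.getInfoGain(q).getIndexAtRank(j);
        Object feature = alpha.lookupObject(alphaId);
        // Use the rank j here (the original had i, which is not the loop variable).
        debugOut.println(j + "\t" + plig.getInfoGain(q).getValueAtRank(j) + "\t" + feature);
    }
    debugOut.println("===============");
}
debugOut.close();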
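Output (the score is identical at every rank within a class, which is consistent with the getValueAtRank(i) typo mentioned above; with getValueAtRank(j) each rank should show its own value):
debugging label numbers: 3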
Class: sexism
0   0.1257616291393775  sexist
1   0.1257616291393775  rt
2   0.1257616291393775  female
3   0.1257616291393775  notsexist
4   0.1257616291393775  m
5   0.1257616291393775  women
6   0.1257616291393775  mt8_9
7   0.1257616291393775  sports
8   0.1257616291393775  islam
9   0.1257616291393775  men
===============
Class: none
0   0.09383300761779656 sexist
1   0.09383300761779656 mkr
2   0.09383300761779656 female
3   0.09383300761779656 muslims
4   0.09383300761779656 rt
5   0.09383300761779656 notsexist
6   0.09383300761779656 women
7   0.09383300761779656 islam
8   0.09383300761779656 mt8_9
9   0.09383300761779656 mohammed
===============
Class: racism
0   0.062072998255453926    islam
1   0.062072998255453926    muslims
2   0.062072998255453926    mkr
3   0.062072998255453926    mohammed
4   0.062072998255453926    muslim
5   0.062072998255453926    maxblumenthal
6   0.062072998255453926    quran
7   0.062072998255453926    years
8   0.062072998255453926    prophet
9   0.062072998255453926    1400
===============
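
To make the ranking above easier to interpret, here is a small self-contained sketch of per-label information gain in its textbook one-vs-rest form: how much knowing whether a feature is present reduces the uncertainty about "does this document have label X". This is an assumption about the general idea only and may not match MALLET's PerLabelInfoGain computation in every detail. The class name, method names, and toy documents are all made up for illustration and are not part of MALLET.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only; none of these names come from MALLET.
class PerLabelInfoGainSketch {

    // Made-up documents (as sets of tokens) and their labels.
    static final List<Set<String>> DOCS = Arrays.asList(
        tokens("goal", "match"),
        tokens("goal", "team"),
        tokens("vote", "election"),
        tokens("vote", "party"));
    static final List<String> LABELS =
        Arrays.asList("sports", "sports", "politics", "politics");

    static Set<String> tokens(String... words) {
        return new HashSet<>(Arrays.asList(words));
    }

    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Entropy in bits of a yes/no variable that is "yes" with probability p.
    static double entropy(double p) {
        if (p <= 0.0 || p >= 1.0) return 0.0;
        return -(p * log2(p) + (1 - p) * log2(1 - p));
    }

    // Information gain of "feature is present" about "label equals target".
    static double infoGain(List<Set<String>> docs, List<String> labels,
                           String feature, String target) {
        int n = docs.size();
        int pos = 0, withF = 0, posWithF = 0;
        for (int i = 0; i < n; i++) {
            boolean isPos = labels.get(i).equals(target);
            boolean hasF = docs.get(i).contains(feature);
            if (isPos) pos++;
            if (hasF) withF++;
            if (isPos && hasF) posWithF++;
        }
        int withoutF = n - withF;
        double hBefore = entropy((double) pos / n);
        double hWith = withF == 0 ? 0.0 : entropy((double) posWithF / withF);
        double hWithout = withoutF == 0 ? 0.0
            : entropy((double) (pos - posWithF) / withoutF);
        // Entropy before the split minus the weighted entropy after it.
        return hBefore - ((double) withF / n) * hWith
                       - ((double) withoutF / n) * hWithout;
    }

    // The k features with the highest gain for target (ties broken alphabetically).
    static List<String> topFeatures(List<Set<String>> docs, List<String> labels,
                                    String target, int k) {
        Set<String> vocab = new TreeSet<>();
        for (Set<String> d : docs) vocab.addAll(d);
        List<String> ranked = new ArrayList<>(vocab);
        ranked.sort((a, b) -> {
            int byGain = Double.compare(infoGain(docs, labels, b, target),
                                        infoGain(docs, labels, a, target));
            return byGain != 0 ? byGain : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }

    public static void main(String[] args) {
        for (String label : new TreeSet<>(LABELS)) {
            System.out.println("Class: " + label);
            for (String f : topFeatures(DOCS, LABELS, label, 2)) {
                System.out.println(f + "\t" + infoGain(DOCS, LABELS, f, label));
            }
        }
    }
}
```

In this toy data, "vote" scores exactly as high for the sports class as "goal" does, because its absence predicts the label just as well as the presence of "goal". Information gain measures how informative a feature is about a label, not whether it argues for or against it, which may be why the same tokens can appear near the top for more than one class.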