Precision and recall measurement problem in Java Lucene
I need to compute precision and recall values in Lucene, and I am using this source code to do it:
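import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.PrintWriter;
import org.apache.lucene.benchmark.quality.*;
import org.apache.lucene.benchmark.quality.trec.*;
import org.apache.lucene.benchmark.quality.utils.*;
import org.apache.lucene.search.*;
import org.apache.lucene.store.*;

public class PrecisionRecall {
    public static void main(String[] args) throws Throwable {
        File topicsFile = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/LIA/lia2e/src/lia/benchmark/topics.txt");
        File qrelsFile = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/LIA/lia2e/src/lia/benchmark/qrels.txt");
        Directory dir = FSDirectory.open(new File("C:/Users/Raden/Documents/myindex"));
        Searcher searcher = new IndexSearcher(dir, true);
        // Name of the stored index field that identifies each document
        String docNameField = "filename";
        PrintWriter logger = new PrintWriter(System.out, true);
        // Read the TREC topics as benchmark queries
        TrecTopicsReader qReader = new TrecTopicsReader();
        QualityQuery qqs[] = qReader.readQueries(
                new BufferedReader(new FileReader(topicsFile)));
        // Create a judge from the TREC relevance judgments (qrels)
        Judge judge = new TrecJudge(new BufferedReader(
                new FileReader(qrelsFile)));
        // Check that the queries and the judgments agree
        judge.validateData(qqs, logger);
        // Build each Lucene query from the topic's "title", searched in the "contents" field
        QualityQueryParser qqParser = new SimpleQQParser("title", "contents");
        QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
        SubmissionReport submitLog = null;
        // Run the benchmark
        QualityStats stats[] = qrun.execute(judge,
                submitLog, logger);
        // Average the per-query statistics and print them
        QualityStats avg = QualityStats.average(stats);
        avg.log("SUMMARY", 2, logger, " ");
        dir.close();
    }
}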
This is the content of the topics file:
<top>
<num> Number: 0
<title> apache source
<desc> Description:
<narr> Narrative:
</top>
The problem appears when I run the code: the precision and recall values are all reported as zero. Here is the output I get:
0 - contents:apache contents:source
0 Stats:
Search Seconds: 0.047
DocName Seconds: 0.039
Num Points: 56.000
Num Good Points: 0.000
Max Good Points: 3.000
Average Precision: 0.000
MRR: 0.000
Recall: 0.000
Precision At 1: 0.000
Precision At 2: 0.000
Precision At 3: 0.000
Precision At 4: 0.000
Precision At 5: 0.000
Precision At 6: 0.000
Precision At 7: 0.000
Precision At 8: 0.000
Precision At 9: 0.000
Precision At 10: 0.000
Precision At 11: 0.000
Precision At 12: 0.000
Precision At 13: 0.000
Precision At 14: 0.000
Precision At 15: 0.000
Precision At 16: 0.000
Precision At 17: 0.000
Precision At 18: 0.000
Precision At 19: 0.000
Precision At 20: 0.000
SUMMARY
Search Seconds: 0.047
DocName Seconds: 0.039
Num Points: 56.000
Num Good Points: 0.000
Max Good Points: 3.000
Average Precision: 0.000
MRR: 0.000
Recall: 0.000
Precision At 1: 0.000
Precision At 2: 0.000
Precision At 3: 0.000
Precision At 4: 0.000
Precision At 5: 0.000
Precision At 6: 0.000
Precision At 7: 0.000
Precision At 8: 0.000
Precision At 9: 0.000
Precision At 10: 0.000
Precision At 11: 0.000
Precision At 12: 0.000
Precision At 13: 0.000
Precision At 14: 0.000
Precision At 15: 0.000
Precision At 16: 0.000
Precision At 17: 0.000
Precision At 18: 0.000
Precision At 19: 0.000
Precision At 20: 0.000
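Can you tell me what I did wrong to end up with precision and recall values of zero? And what does it mean when precision and recall are zero? I am doing this because I need to measure the performance of my search engine, and precision and recall are one of the ways I plan to do that.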
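A precision of 0 means that none of your results were correct.
I'd suggest trying a single query by hand and looking at what comes back. You may have a problem with the tokenizer, or perhaps something was not written to the index the way you expect.
I think the problem is in the indexing process. QualityBenchmark runs a search for each query in the topics file and then, for every hit, looks up the value of the document-name field (docNameField, here "filename") in the Lucene index.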
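To make the meaning of those zeros concrete, here is a minimal sketch in plain Java (no Lucene involved) of how precision and recall follow from comparing the returned document names with the names judged relevant. The class and method names are made up for illustration, and apart from apache1.0.txt the file names are hypothetical; this is not how QualityStats is implemented internally. The first case assumes that a hit without a stored "filename" value yields a null name, so nothing can match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PrecisionRecallSketch {

    /** Fraction of the returned documents that are judged relevant. */
    static double precision(List<String> returned, Set<String> relevant) {
        if (returned.isEmpty()) {
            return 0.0;
        }
        return (double) countRelevant(returned, relevant) / returned.size();
    }

    /** Fraction of the judged-relevant documents that were returned. */
    static double recall(List<String> returned, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) countRelevant(returned, relevant) / relevant.size();
    }

    /** Counts returned names that match a relevant name exactly (each document is assumed to appear once). */
    private static long countRelevant(List<String> returned, Set<String> relevant) {
        return returned.stream().filter(relevant::contains).count();
    }

    public static void main(String[] args) {
        // Three relevant documents, as in "Max Good Points: 3.000"; the last two names are hypothetical.
        Set<String> relevant = new HashSet<>(
                Arrays.asList("apache1.0.txt", "apache1.1.txt", "apache2.0.txt"));

        // Case 1 (assumption): the index has no stored "filename" field, so every name is null.
        List<String> withoutNames = Arrays.asList(null, null, null, null);
        System.out.println("no names:   P=" + precision(withoutNames, relevant)
                + " R=" + recall(withoutNames, relevant));

        // Case 2: names are stored and two of the four hits are relevant.
        List<String> withNames = Arrays.asList(
                "apache1.0.txt", "readme.txt", "apache1.1.txt", "notes.txt");
        System.out.println("with names: P=" + precision(withNames, relevant)
                + " R=" + recall(withNames, relevant));
    }
}
```

In the first case both values are zero even though the search itself returned hits, which matches the pattern in the output above: 56 points, 0 good points.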
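This means that when you build the index you need to create an explicit document field that stores the ID of each .txt file in the corpus (in your case, its name). For example, declare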
public static final String FIELD_NAME = "filename";
and later
document.add(new TextField(FIELD_NAME, "apache1.0.txt", Field.Store.YES));
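The same goes for the other two files. Otherwise the hits cannot be matched against the document names in the configuration file. I had the same problem, but after adding the new custom field it worked like a charm :-)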
N.B. The format of the two benchmark configuration files (qrels.txt and topics.txt) is based on the TREC-9 format.
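Because the names in qrels.txt have to match the stored field values exactly, a quick check outside Lucene can help. The sketch below assumes the usual TREC qrels line layout of four whitespace-separated columns (query ID, iteration, document name, relevance, where 0 means not relevant). The class name, the sample lines, and the full-path value are hypothetical and only illustrate a mismatch; you would collect the stored values from your own index.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class QrelsNameCheck {

    /** Collects the document names judged relevant in lines of the form "queryId iteration docName relevance". */
    static Set<String> relevantNames(List<String> qrelsLines) {
        Set<String> names = new TreeSet<>();
        for (String line : qrelsLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length < 4) {
                throw new IllegalArgumentException("Malformed qrels line: " + line);
            }
            if (!"0".equals(parts[3])) {
                names.add(parts[2]);
            }
        }
        return names;
    }

    /** Returns the judged names that have no exact match among the stored field values. */
    static Set<String> unmatched(Set<String> judged, Collection<String> storedValues) {
        Set<String> stored = new HashSet<>(storedValues);
        Set<String> missing = new TreeSet<>();
        for (String name : judged) {
            if (!stored.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical judgments for query 0.
        List<String> qrels = Arrays.asList(
                "0 0 apache1.0.txt 1",
                "0 0 apache1.1.txt 1",
                "0 0 readme.txt 0");
        // Hypothetical stored values: one bare file name and one full path.
        List<String> stored = Arrays.asList("apache1.0.txt", "C:/docs/apache1.1.txt");

        System.out.println("Judged relevant: " + relevantNames(qrels));
        System.out.println("Not found in index: " + unmatched(relevantNames(qrels), stored));
    }
}
```

Any name reported as not found can never be counted as a good point, so storing a full path where qrels.txt lists a bare file name (or the reverse) would be enough to keep precision and recall at zero.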