Problem measuring precision and recall in Lucene (Java)


I need to compute precision and recall values in Lucene, and I am using this source code to do it:

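import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.PrintWriter;

import org.apache.lucene.benchmark.quality.Judge;
import org.apache.lucene.benchmark.quality.QualityBenchmark;
import org.apache.lucene.benchmark.quality.QualityQuery;
import org.apache.lucene.benchmark.quality.QualityQueryParser;
import org.apache.lucene.benchmark.quality.QualityStats;
import org.apache.lucene.benchmark.quality.trec.TrecJudge;
import org.apache.lucene.benchmark.quality.trec.TrecTopicsReader;
import org.apache.lucene.benchmark.quality.utils.SimpleQQParser;
import org.apache.lucene.benchmark.quality.utils.SubmissionReport;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class PrecisionRecall {

    public static void main(String[] args) throws Throwable {
        File topicsFile = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/LIA/lia2e/src/lia/benchmark/topics.txt");
        File qrelsFile = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/LIA/lia2e/src/lia/benchmark/qrels.txt");
        Directory dir = FSDirectory.open(new File("C:/Users/Raden/Documents/myindex"));
        Searcher searcher = new IndexSearcher(dir, true);   // open the index read-only

        // Field whose stored value is compared with the document names in qrels.txt
        String docNameField = "filename";

        PrintWriter logger = new PrintWriter(System.out, true);

        // Read the TREC topics as quality queries
        TrecTopicsReader qReader = new TrecTopicsReader();
        QualityQuery[] qqs = qReader.readQueries(
            new BufferedReader(new FileReader(topicsFile)));

        // Create the judge from the TREC qrels file
        Judge judge = new TrecJudge(new BufferedReader(
            new FileReader(qrelsFile)));

        // Check that the queries and the judgments match
        judge.validateData(qqs, logger);

        // Build Lucene queries from each topic's "title", searching the "contents" field
        QualityQueryParser qqParser = new SimpleQQParser("title", "contents");

        // Run the benchmark
        QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
        SubmissionReport submitLog = null;
        QualityStats[] stats = qrun.execute(judge, submitLog, logger);

        // Print the averaged precision and recall measures
        QualityStats avg = QualityStats.average(stats);
        avg.log("SUMMARY", 2, logger, "  ");

        searcher.close();
        dir.close();
    }
}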
This is the content of the topics file:

 <top>
<num> Number: 0
<title> apache source
<desc> Description:
<narr> Narrative:
</top>
The problem is that when I run the code, the precision and recall values all come out as zero. Here is the output of a run:

0  -  contents:apache contents:source

0 Stats:
Search Seconds:         0.047
DocName Seconds:        0.039
Num Points:            56.000
Num Good Points:        0.000
Max Good Points:        3.000
Average Precision:      0.000
MRR:                    0.000
Recall:                 0.000
Precision At 1:         0.000
Precision At 2:         0.000
Precision At 3:         0.000
Precision At 4:         0.000
Precision At 5:         0.000
Precision At 6:         0.000
Precision At 7:         0.000
Precision At 8:         0.000
Precision At 9:         0.000
Precision At 10:        0.000
Precision At 11:        0.000
Precision At 12:        0.000
Precision At 13:        0.000
Precision At 14:        0.000
Precision At 15:        0.000
Precision At 16:        0.000
Precision At 17:        0.000
Precision At 18:        0.000
Precision At 19:        0.000
Precision At 20:        0.000



SUMMARY
Search Seconds:         0.047
DocName Seconds:        0.039
Num Points:            56.000
Num Good Points:        0.000
Max Good Points:        3.000
Average Precision:      0.000
MRR:                    0.000
Recall:                 0.000
Precision At 1:         0.000
Precision At 2:         0.000
Precision At 3:         0.000
Precision At 4:         0.000
Precision At 5:         0.000
Precision At 6:         0.000
Precision At 7:         0.000
Precision At 8:         0.000
Precision At 9:         0.000
Precision At 10:        0.000
Precision At 11:        0.000
Precision At 12:        0.000
Precision At 13:        0.000
Precision At 14:        0.000
Precision At 15:        0.000
Precision At 16:        0.000
Precision At 17:        0.000
Precision At 18:        0.000
Precision At 19:        0.000
Precision At 20:        0.000
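Can you tell me what I did wrong to end up with precision and recall of zero? And what does it mean when precision and recall are zero? I am doing this because I need to measure the performance of my search engine, and precision and recall are one of the ways to do that.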


Thanks.


Precision = 0 means that none of your results were correct.


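I would suggest running a single query by hand and looking at what comes back. You may have a problem with your tokenizer; maybe your terms are not being processed (for example, stemmed) the way you expect, and so on.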
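
To make concrete what a precision and recall of zero mean, here is a minimal stdlib-only sketch; it is not Lucene's own QualityStats code. It assumes three judged-relevant documents (matching "Max Good Points: 3.000" in the output above); apart from apache1.0.txt, all file names are hypothetical. It also assumes that a hit whose name cannot be resolved is represented as null, so it can never match a judged name.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PrecisionRecallSketch {

    /** Number of returned document names that appear in the relevant set. */
    static int countGood(List<String> returned, Set<String> relevant) {
        int good = 0;
        for (String name : returned) {
            if (relevant.contains(name)) {
                good++;
            }
        }
        return good;
    }

    /** Precision: relevant results returned / all results returned. */
    static double precision(List<String> returned, Set<String> relevant) {
        if (returned.isEmpty()) {
            return 0.0;
        }
        return (double) countGood(returned, relevant) / returned.size();
    }

    /** Recall: relevant results returned / all relevant documents. */
    static double recall(List<String> returned, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) countGood(returned, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // Hypothetical judgments: only apache1.0.txt is a name taken from this thread.
        Set<String> relevant = new HashSet<>(Arrays.asList(
            "apache1.0.txt", "apache1.1.txt", "apache2.0.txt"));

        // Case 1: hit names resolve and some of them are judged relevant.
        List<String> resolved = Arrays.asList(
            "apache1.0.txt", "other.txt", "apache1.1.txt", "misc.txt");
        System.out.printf("resolved:   P=%.3f R=%.3f%n",
            precision(resolved, relevant), recall(resolved, relevant));

        // Case 2: the search finds documents, but none of their names can be
        // resolved (modelled here as null), so nothing matches the judgments.
        List<String> unresolved = Collections.<String>nCopies(4, null);
        System.out.printf("unresolved: P=%.3f R=%.3f%n",
            precision(unresolved, relevant), recall(unresolved, relevant));
    }
}
```

The second case mirrors the situation in the question: the search returns hits ("Num Points: 56.000"), yet none of them counts as good, so every measure derived from the number of good hits is zero.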

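I think the problem is in the indexing process. If you look at how the benchmark runs, you will see that a search is launched for the query text, and the document name of each hit is then looked up as the value of the field named "filename" in the Lucene index.

This means that when you build the index you need to create an explicit document field that stores the ID of each .txt file in the corpus (in your case, its name). For example, declare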

public static final String FIELD_NAME = "filename";
and later

document.add(new TextField(FIELD_NAME, "apache1.0.txt", Field.Store.YES));
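and likewise for the other two files. Otherwise the benchmark cannot match the hits to the document names listed in the configuration file. I had the same problem, but after adding the new custom field it worked like a charm :-)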

N.B. The format of the two benchmark configuration files, qrels.txt and topics.txt, is based on the TREC 9 format.
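
Since the fix depends on the names in qrels.txt matching the values stored in the "filename" field, a quick consistency check can help. The sketch below is stdlib-only and makes several assumptions: each judgment line has the form "queryNumber 0 docName relevance", blank lines and lines starting with # are skipped, and the qrels lines shown are hypothetical, because the thread never shows the actual qrels.txt. The helper name missingFromIndex is likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class QrelsNameCheck {

    /**
     * Returns the document names mentioned in the qrels lines that are not
     * among the names stored in the index, in order of first appearance.
     */
    static List<String> missingFromIndex(List<String> qrelsLines, Set<String> indexedNames) {
        List<String> missing = new ArrayList<>();
        for (String line : qrelsLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // assumed: blank lines and comments carry no judgment
            }
            String[] parts = trimmed.split("\\s+");
            if (parts.length < 4) {
                continue; // not a complete judgment line
            }
            String docName = parts[2]; // third column is the document name
            if (!indexedNames.contains(docName) && !missing.contains(docName)) {
                missing.add(docName);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical qrels content for topic 0.
        List<String> qrels = Arrays.asList(
            "0 0 apache1.0.txt 1",
            "0 0 apache1.1.txt 1",
            "0 0 apache2.0.txt 1");

        // Names stored in the "filename" field so far (only one file indexed with it).
        Set<String> indexed = new HashSet<>(Arrays.asList("apache1.0.txt"));

        System.out.println("Judged but not indexed by name: "
            + missingFromIndex(qrels, indexed));
    }
}
```

In real use, the set of indexed names would be collected by reading the stored "filename" value of each document in the index; any name reported here is one the benchmark can never count as a good hit.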
