How do I score similar documents in Lucene?
I want to score similar documents in Lucene. Let me explain my situation. For example, suppose my file contains the following records, and I have built an index on them:

ID|First Name|Last Name|DOB
1 |John |Doe |03/18/1990
1 |John |Twain |03/18/1990
3 |Joey |Johnson |05/14/1978
3 |Joey |Johnson |05/14/1987
4 |Joey |Johnson |05/14/1987

Let me know if you have any questions. I have only been learning Lucene for the past two weeks, so I don't know much about it.

Tags: lucene, lucene.net, similarity, morelikethis
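Note: I am using Lucene.Net 3.0.3.

Can you show the code for the method QueryMaker()?

I think you can create a new field, "name", that combines firstname and lastname, and then search that field with a FuzzyQuery. FuzzyQuery scores documents by the Levenshtein distance between strings.

What are the values of sa and searchfield?

sa is the query entered by the user: String sa = textbox1.Text; and String[] searchfield = new String[] { "ID", "First Name", "Last Name", "DOB" };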
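To make the Levenshtein distance mentioned in the FuzzyQuery comment concrete, here is a minimal, self-contained sketch of the classic dynamic-programming edit distance. It is only an illustration of the metric: the class name EditDistance and the method Levenshtein are hypothetical, and this is not Lucene.Net's own implementation or scoring formula.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Lucene.Net.
public static class EditDistance
{
    // Returns the minimum number of single-character insertions,
    // deletions, and substitutions needed to turn a into b.
    public static int Levenshtein(string a, string b)
    {
        int[] prev = new int[b.Length + 1];
        int[] curr = new int[b.Length + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.Length; j++)
        {
            prev[j] = j;
        }

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.Min(Math.Min(insert, delete), substitute);
            }

            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }

        return prev[b.Length];
    }
}
```

With the sample records, EditDistance.Levenshtein("Joey", "John") is 2 and EditDistance.Levenshtein("John Doe", "John Twain") is 5, so a smaller distance corresponds to a closer match on the combined "name" field.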
ID|First Name|Last Name|DOB
1 |John |Doe |03/18/1990
3 |Joey |Johnson |05/14/1978
3 |Joey |Johnson |05/14/1987
4 |Joey |Johnson |05/14/1987
1 |John |Twain |03/18/1990
2 |Daniel |Doe |03/25/1989
ID|First Name|Last Name|DOB
1 |John |Doe |03/18/1990
1 |John |Twain |03/18/1990
3 |Joey |Johnson |05/14/1978
3 |Joey |Johnson |05/14/1987
4 |Joey |Johnson |05/14/1987
2 |Daniel |Doe |03/25/1989
String sa = textbox1.Text; // Assume this value is "John Doe" in this case.
String[] searchfield = new string[] { "ID", "First Name", "Last Name", "DOB" };
IndexReader reader = IndexReader.Open(dir, true); // open the index read-only
TopScoreDocCollector coll = TopScoreDocCollector.Create(50, true);
// indexSearcher is assumed to be an IndexSearcher created elsewhere.
indexSearcher.Search(QueryMaker(sa, searchfield), coll);
ScoreDoc[] hits = coll.TopDocs().ScoreDocs;
for (int i = 0; i < hits.Length; i++)
{
    SearchResults result = new SearchResults();
    int docID = hits[i].Doc;
    Document d = indexSearcher.Doc(docID);
    result.fname = d.Get("First Name"); // Get already returns a string
}
IndexSearcher mltsearcher = new IndexSearcher(reader);
MoreLikeThis mlt = new MoreLikeThis(reader);
int docid = hits[1].Doc; // second hit; requires at least two hits
Query query = mlt.Like(docid);
TopDocs similardocs = mltsearcher.Search(query, 10);