Lucene: a PhraseQuery on the title field plus a QueryParser on a catch-all field does not produce the results I expect
If a user types a phrase into the search box (with or without quotes), I want the documents whose title contains the exact phrase to be listed first, followed by the other matching documents. This is what I have tried, but it does not return the search results in that order.

During indexing, I have:
AddStringFieldToDocument(document, "keyWord", this.BuildKeywordsString(), Field.Store.NO, Field.Index.ANALYZED, false);
AddStringFieldToDocument(document, "title", this.Title, Field.Store.NO, Field.Index.ANALYZED, false, 4f);
// Adds an analyzed string field without an index-time boost.
private void AddStringFieldToDocument(Document document, string fieldName, string fieldValue,
    Field.Store store, Field.Index index, bool setOmitTermFreqAndPositions)
{
    if (fieldValue == null)
    {
        return;
    }
    var field = GetFieldToAddToDocument(document, fieldName, fieldValue, store, index, setOmitTermFreqAndPositions);
    document.Add(field);
}

// Same as above, but applies an index-time boost to the field.
private void AddStringFieldToDocument(Document document, string fieldName, string fieldValue,
    Field.Store store, Field.Index index, bool setOmitTermFreqAndPositions, Single boost)
{
    if (fieldValue == null)
    {
        return;
    }
    var field = GetFieldToAddToDocument(document, fieldName, fieldValue, store, index, setOmitTermFreqAndPositions);
    field.SetBoost(boost); // boosting title
    document.Add(field);
}

// Creates the field; positions must be kept (false) for phrase queries to work on it.
private Field GetFieldToAddToDocument(Document document, string fieldName, string fieldValue, Field.Store store,
    Field.Index index, bool setOmitTermFreqAndPositions)
{
    Field field = new Field(fieldName, fieldValue, store, index);
    field.SetOmitTermFreqAndPositions(setOmitTermFreqAndPositions);
    return field;
}
At search time, as part of a BooleanQuery, I have:
if (!string.IsNullOrWhiteSpace(queryString))
{
    QueryParser qpKeyWord = new QueryParser(myVersionUsed, "keyWord", StandardAnalyzer);
    Query qKeyWord = qpKeyWord.Parse(queryString);
    booleanQuery.Add(qKeyWord, BooleanClause.Occur.MUST);

    Term titleTerm = new Term("title", queryString);
    PhraseQuery qTitleWord = new PhraseQuery();
    qTitleWord.SetSlop(12);
    qTitleWord.Add(titleTerm);
    qTitleWord.SetBoost(5);
    booleanQuery.Add(qTitleWord, BooleanClause.Occur.SHOULD);
}
The results I get are mixed. In addition, when I run IndexSearcher.Explain(query, docId), I get:
Document Id: 92871
0.5439626 = (MATCH) product of:
  0.8159439 = (MATCH) sum of:
    0.5884751 = (MATCH) sum of:
      0.2580064 = (MATCH) weight(KeyWord:chicken in 92871), product of:
        0.2226703 = queryWeight(KeyWord:chicken), product of:
          3.236447 = idf(docFreq=25345, maxDocs=237239)
          0.06880084 = queryNorm
        1.158692 = (MATCH) fieldWeight(KeyWord:chicken in 92871), product of:
          4.582576 = tf(termFreq(KeyWord:chicken)=21)
          3.236447 = idf(docFreq=25345, maxDocs=237239)
          0.078125 = fieldNorm(field=KeyWord, doc=92871)
      0.3304687 = (MATCH) weight(KeyWord:parmesan in 92871), product of:
        0.2962231 = queryWeight(KeyWord:parmesan), product of:
          4.305515 = idf(docFreq=8701, maxDocs=237239)
          0.06880084 = queryNorm
        1.115608 = (MATCH) fieldWeight(KeyWord:parmesan in 92871), product of:
          3.316625 = tf(termFreq(KeyWord:parmesan)=11)
          4.305515 = idf(docFreq=8701, maxDocs=237239)
          0.078125 = fieldNorm(field=KeyWord, doc=92871)
    0.2274688 = (MATCH) weight(has_photo:y in 92871), product of:
      0.1251001 = queryWeight(has_photo:y), product of:
        1.818294 = idf(docFreq=104665, maxDocs=237239)
        0.06880084 = queryNorm
      1.818294 = (MATCH) fieldWeight(has_photo:y in 92871), product of:
        1 = tf(termFreq(has_photo:y)=1)
        1.818294 = idf(docFreq=104665, maxDocs=237239)
        1 = fieldNorm(field=has_photo, doc=92871)
  0.6666667 = coord(2/3)
There is no score associated with the phrase query, only a separate score for each keyword. However, when I run query.ToString() on the search, I get:
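That suggests the query is written correctly, right? What am I missing?

Given the way you build the title query, I doubt you will ever get a hit from the title clause.

Your PhraseQuery is built to look for a single term, "Chicken Parmesan", but when you indexed the title, StandardAnalyzer produced two terms: "chicken" and "parmesan". You need to build the phrase query from both of those terms.

To do that, you can use QueryParser: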
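QueryParser qp = new QueryParser("keyWord", new StandardAnalyzer());
// Field names are case-sensitive, so use the same "keyWord" spelling as at index time.
// The quoted title clause is analyzed into two terms; ~12 is the slop and ^5.0 the boost.
Query q = qp.Parse("+(keyWord:chicken keyWord:parmesan) title:\"Chicken Parmesan\"~12^5.0");
var hits = searcher.Search(q);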
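To illustrate why the original title clause could not match, here is a toy model that does not use Lucene at all. `Analyze` is a hypothetical, much simplified stand-in for StandardAnalyzer (it only splits on a few separators and lowercases), and `PhraseMatches` checks for adjacent terms only, ignoring slop. Treat it as a sketch of the idea, not of Lucene's actual matching code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TermMatchSketch
{
    // Hypothetical stand-in for StandardAnalyzer: split on separators, then lowercase.
    public static List<string> Analyze(string text)
    {
        char[] separators = { ' ', '\t', ',', '.', ';', ':', '!', '?' };
        return text
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.ToLowerInvariant())
            .ToList();
    }

    // True when the query terms occur in order at consecutive positions (slop ignored).
    public static bool PhraseMatches(IList<string> indexedTerms, IList<string> queryTerms)
    {
        if (queryTerms.Count == 0)
        {
            return false;
        }
        for (int start = 0; start + queryTerms.Count <= indexedTerms.Count; start++)
        {
            bool allEqual = true;
            for (int i = 0; i < queryTerms.Count; i++)
            {
                if (indexedTerms[start + i] != queryTerms[i])
                {
                    allEqual = false;
                    break;
                }
            }
            if (allEqual)
            {
                return true;
            }
        }
        return false;
    }

    public static void Main()
    {
        // What ends up in the index for an analyzed title.
        List<string> indexedTitle = Analyze("Baked Chicken Parmesan");

        // One raw, unanalyzed term, as in new Term("title", queryString).
        var rawTerm = new List<string> { "Chicken Parmesan" };

        // The same input passed through the analyzer first.
        List<string> analyzedTerms = Analyze("Chicken Parmesan");

        Console.WriteLine(PhraseMatches(indexedTitle, rawTerm));       // False
        Console.WriteLine(PhraseMatches(indexedTitle, analyzedTerms)); // True
    }
}
```

The raw term fails for two reasons at once: it contains a space, which no indexed term does, and it keeps its upper-case letters, which the analyzer removed at index time.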
If you don't want to use QueryParser, use the TokenStream API to break the text into tokens:
PhraseQuery titleQuery = new PhraseQuery();
titleQuery.SetSlop(12);
titleQuery.SetBoost(5);

BooleanQuery keywordQuery = new BooleanQuery();

// Run the user input through the same analyzer that was used at index time.
var standard = new StandardAnalyzer();
TokenStream tokens = standard.TokenStream("title", new StringReader("Chicken Parmesan"));

while (tokens.IncrementToken())
{
    TermAttribute termAttribute = (TermAttribute)tokens.GetAttribute(typeof(TermAttribute));

    // One phrase term per analyzed token, instead of one term for the whole input.
    titleQuery.Add(new Term("title", termAttribute.Term()));

    keywordQuery.Add(
        new TermQuery(
            new Term("keyWord", termAttribute.Term())),
        BooleanClause.Occur.SHOULD);
}

// The keyword clause is required; the boosted title phrase only lifts the ranking.
BooleanQuery query = new BooleanQuery();
query.Add(keywordQuery, BooleanClause.Occur.MUST);
query.Add(titleQuery, BooleanClause.Occur.SHOULD);

var hits = searcher.Search(query);
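The Explain output in the question appears consistent with this diagnosis. The sketch below recomputes the reported score from the numbers in that output, assuming the classic DefaultSimilarity formulas (tf = sqrt(termFreq), idf = 1 + ln(maxDocs / (docFreq + 1))). The class and method names are made up for illustration, and queryNorm and fieldNorm are simply copied from the output rather than derived. The `coord(2/3)` factor says that only two of three top-level clauses matched; the unmatched one is presumably the title phrase, although the query.ToString() output that would confirm this is not shown.

```csharp
using System;
using System.Globalization;

public static class ExplainCheck
{
    // Classic DefaultSimilarity term frequency factor.
    public static double Tf(int termFreq)
    {
        return Math.Sqrt(termFreq);
    }

    // Classic DefaultSimilarity inverse document frequency.
    public static double Idf(int docFreq, int maxDocs)
    {
        return 1.0 + Math.Log((double)maxDocs / (docFreq + 1));
    }

    // weight = queryWeight * fieldWeight, as in the "product of" lines of Explain.
    public static double TermWeight(int termFreq, int docFreq, int maxDocs,
        double fieldNorm, double queryNorm)
    {
        double idf = Idf(docFreq, maxDocs);
        double queryWeight = idf * queryNorm;
        double fieldWeight = Tf(termFreq) * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    // Recomputes the score of document 92871 from the values in the Explain output.
    public static double TotalScore()
    {
        const int maxDocs = 237239;
        const double queryNorm = 0.06880084; // copied from the Explain output

        double chicken = TermWeight(21, 25345, maxDocs, 0.078125, queryNorm);
        double parmesan = TermWeight(11, 8701, maxDocs, 0.078125, queryNorm);
        double hasPhoto = TermWeight(1, 104665, maxDocs, 1.0, queryNorm);

        // Two of the three top-level clauses matched, hence coord(2/3).
        double coord = 2.0 / 3.0;
        return (chicken + parmesan + hasPhoto) * coord;
    }

    public static void Main()
    {
        // Prints about 0.5440, in line with the 0.5439626 reported by Explain.
        Console.WriteLine(TotalScore().ToString("F4", CultureInfo.InvariantCulture));
    }
}
```

Once the title phrase is built from the analyzed terms and matches, a third contribution should appear under the top-level sum and the coord factor should no longer penalize documents whose title contains the phrase.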