Strange filter behavior in Lucene 5
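In Lucene 5, Filter is deprecated in favor of wrapping an ordinary Query object in a ConstantScoreQuery. I ran into a case where the Query "translated" from an old Filter object does not behave the way I expected: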
// Imports assume Lucene 5.x with the analyzers-common and queries modules on the classpath.
import org.apache.lucene.analysis.core.KeywordAnalyzer
import org.apache.lucene.document.{Document, Field, StringField}
import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig, Term}
import org.apache.lucene.queries.{BooleanFilter, TermFilter}
import org.apache.lucene.search.{BooleanClause, BooleanQuery, ConstantScoreQuery, IndexSearcher, MatchAllDocsQuery, TermQuery}
import org.apache.lucene.store.RAMDirectory

// Index two documents, each with two values for the field "k".
val directory = new RAMDirectory()
val config = new IndexWriterConfig(new KeywordAnalyzer())
val writer = new IndexWriter(directory, config)
writer.addDocument({
  val document = new Document()
  document.add(new StringField("k", "v1", Field.Store.YES))
  document.add(new StringField("k", "v2", Field.Store.YES))
  document
})
writer.addDocument({
  val document = new Document()
  document.add(new StringField("k", "v1", Field.Store.YES))
  document.add(new StringField("k", "v3", Field.Store.YES))
  document
})
writer.commit()

val reader = DirectoryReader.open(directory)
val searcher = new IndexSearcher(reader)

// New style: the filter expressed as a Query, -(+k:v1 -k:v2).
val filter =
  new BooleanQuery.Builder().add(
    new BooleanQuery.Builder()
      .add(new ConstantScoreQuery(new TermQuery(new Term("k", "v1"))), BooleanClause.Occur.MUST)
      .add(new ConstantScoreQuery(new TermQuery(new Term("k", "v2"))), BooleanClause.Occur.MUST_NOT)
      .build(),
    BooleanClause.Occur.MUST_NOT
  ).build()
Console.println("filter: " + filter)
val results = searcher.search(filter, Int.MaxValue)
Console.println("# results: " + results.totalHits)

// Old style: the same logic expressed with the deprecated BooleanFilter.
val filter2 = new BooleanFilter()
filter2.add({
  val inner = new BooleanFilter()
  inner.add(new TermFilter(new Term("k", "v1")), BooleanClause.Occur.MUST)
  inner.add(new TermFilter(new Term("k", "v2")), BooleanClause.Occur.MUST_NOT)
  inner
}, BooleanClause.Occur.MUST_NOT)
Console.println("filter2: " + filter2)
val results2 = searcher.search(new MatchAllDocsQuery(), filter2, Int.MaxValue)
Console.println("# results2: " + results2.totalHits)

The console output is:
filter: -(+ConstantScore(k:v1) -ConstantScore(k:v2))
# results: 0
filter2: BooleanFilter(-BooleanFilter(+k:v1 -k:v2))
# results2: 1
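
From my point of view, filter and filter2 should be equivalent in Lucene 5, but the results clearly differ: the Query version matches nothing, while the Filter version matches one document. What am I doing wrong?

The answer seems to come from this post, quoted here: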
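
Boolean queries must have at least one "positive" expression (i.e. MUST or SHOULD) in order to match. Solr tries to help with this: if asked to execute a BooleanQuery that contains only negated clauses at the topmost level, it adds a match-all-docs query (i.e. *:*).

If the top-level BooleanQuery contains, somewhere inside it, a nested BooleanQuery that contains only negated clauses, that nested query is not modified, and it (by definition) cannot match any documents. If it is required, the outer query will not match either.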
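
So, in short, I think I have to add a MatchAllDocsQuery to the BooleanQuery.Builder so that there is at least one MUST or SHOULD clause and the query can actually match something (otherwise it never matches anything). Modifying filter as follows works: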
val filter =
  new BooleanQuery.Builder().add(
    new BooleanQuery.Builder()
      .add(new ConstantScoreQuery(new TermQuery(new Term("k", "v1"))), BooleanClause.Occur.MUST)
      .add(new ConstantScoreQuery(new TermQuery(new Term("k", "v2"))), BooleanClause.Occur.MUST_NOT)
      .build(),
    BooleanClause.Occur.MUST_NOT
  ).add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD).build()
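
To make the difference concrete, here is a toy set-based model of the matching rule described in the quote. It is only a sketch and not Lucene's implementation: the names PureNegativeDemo, term and bool are made up for illustration, scoring and minimumShouldMatch are ignored, and the "index" is just a Map holding the two documents from the question.

```scala
object PureNegativeDemo {
  sealed trait Occur
  case object Must extends Occur
  case object Should extends Occur
  case object MustNot extends Occur

  // Toy index (assumption): doc id -> values of field "k", as in the question.
  val index: Map[Int, Set[String]] = Map(0 -> Set("v1", "v2"), 1 -> Set("v1", "v3"))
  val allDocs: Set[Int] = index.keySet

  // Ids of the documents that contain the given value.
  def term(v: String): Set[Int] = index.collect { case (id, vs) if vs(v) => id }.toSet

  // Simplified boolean matching: candidates come only from positive clauses
  // (MUST, or SHOULD when there is no MUST); MUST_NOT can only remove candidates.
  def bool(clauses: (Set[Int], Occur)*): Set[Int] = {
    val must = clauses.collect { case (s, Must) => s }
    val should = clauses.collect { case (s, Should) => s }
    val mustNot = clauses.collect { case (s, MustNot) => s }
    val positive: Set[Int] =
      if (must.nonEmpty) must.reduce(_ intersect _)
      else if (should.nonEmpty) should.reduce(_ union _)
      else Set.empty // no positive clause: nothing to start from
    positive -- mustNot.flatten
  }

  val inner: Set[Int] = bool((term("v1"), Must), (term("v2"), MustNot))
  val original: Set[Int] = bool((inner, MustNot))               // -(+k:v1 -k:v2)
  val fixed: Set[Int] = bool((inner, MustNot), (allDocs, Should)) // with match-all added

  def main(args: Array[String]): Unit = {
    println("inner:    " + inner)    // Set(1)
    println("original: " + original) // Set()
    println("fixed:    " + fixed)    // Set(0)
  }
}
```

In this model the original query is empty because it has no positive clause to draw candidates from, while the version with the match-all clause returns the single document that the old BooleanFilter found.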