Strange filter behavior in Lucene 5

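In Lucene 5, Filter is deprecated in favor of wrapping an ordinary query object in ConstantScoreQuery. I ran into a case where a query object "translated" from an old filter object does not behave the way I expected: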

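// Needs lucene-core, lucene-analyzers-common and lucene-queries (5.x) on the classpath
import org.apache.lucene.analysis.core.KeywordAnalyzer
import org.apache.lucene.document.{Document, Field, StringField}
import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig, Term}
import org.apache.lucene.queries.{BooleanFilter, TermFilter}
import org.apache.lucene.search.{BooleanClause, BooleanQuery, ConstantScoreQuery, IndexSearcher, MatchAllDocsQuery, TermQuery}
import org.apache.lucene.store.RAMDirectory

val directory = new RAMDirectory()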
val config = new IndexWriterConfig(new KeywordAnalyzer())
val writer = new IndexWriter(directory, config)
writer.addDocument({
  val document = new Document()
  document.add(new StringField("k", "v1", Field.Store.YES))
  document.add(new StringField("k", "v2", Field.Store.YES))
  document
})
writer.addDocument({
  val document = new Document()
  document.add(new StringField("k", "v1", Field.Store.YES))
  document.add(new StringField("k", "v3", Field.Store.YES))
  document
})
writer.commit()

val reader = DirectoryReader.open(directory)
val searcher = new IndexSearcher(reader)

val filter =
  new BooleanQuery.Builder().add(
    new BooleanQuery.Builder()
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v1") ) ), BooleanClause.Occur.MUST)
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v2") ) ), BooleanClause.Occur.MUST_NOT)
      .build()
    ,
    BooleanClause.Occur.MUST_NOT
  ).build()

Console.println("filter: " + filter)
val results = searcher.search(filter, Int.MaxValue)
Console.println("# results: " + results.totalHits)

val filter2 = new BooleanFilter()

filter2.
  add({
    val inner = new BooleanFilter()
    inner add(new TermFilter(new Term("k", "v1")), BooleanClause.Occur.MUST)
    inner add(new TermFilter(new Term("k", "v2")), BooleanClause.Occur.MUST_NOT)
    inner
  }, BooleanClause.Occur.MUST_NOT)

Console.println("filter2: " + filter2)
val results2 = searcher.search(new MatchAllDocsQuery(), filter2, Int.MaxValue)
Console.println("# results2: " + results2.totalHits)

The console output is:

filter: -(+ConstantScore(k:v1) -ConstantScore(k:v2))
# results: 0
filter2: BooleanFilter(-BooleanFilter(+k:v1 -k:v2))
# results2: 1

As I see it, filter and filter2 should behave the same in Lucene 5, but the results are clearly opposite. What am I doing wrong?

The answer seems to come from this post.

Quoting it:

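Boolean queries must have at least one "positive" expression (i.e. MUST or SHOULD) in order to match. Solr tries to help with this: if asked to execute a BooleanQuery that contains only negated clauses at the topmost level, it adds a query that matches all documents (i.e. *:*).

If the top-level BooleanQuery contains, somewhere inside it, a nested BooleanQuery that contains only negated clauses, that nested query is not modified and (by definition) cannot match any documents. If it is required, the outer query will not match either.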

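So, in short, I think I have to add a MatchAllDocsQuery to the BooleanQuery.Builder so that there is at least one MUST or SHOULD clause. Only then can the query actually match something (otherwise it never matches anything). Changing filter as follows does the trick: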

val filter =
  new BooleanQuery.Builder().add(
    new BooleanQuery.Builder()
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v1") ) ), BooleanClause.Occur.MUST)
      .add(new ConstantScoreQuery( new TermQuery( new Term("k", "v2") ) ), BooleanClause.Occur.MUST_NOT)
      .build()
    ,
    BooleanClause.Occur.MUST_NOT
  ).add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD).build()
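
To see why the extra SHOULD clause changes the result, here is a small toy model of the rule quoted above, written with plain Scala sets. It is only a sketch and not Lucene code: the names Occur, TermQ, MatchAll, BoolQ and BooleanModel are made up for illustration, and scoring and minimumShouldMatch are ignored. The two documents mirror the ones indexed in the question.

```scala
// Toy model only: every type and name here is hypothetical, not part of Lucene.
sealed trait Occur
case object Must extends Occur
case object Should extends Occur
case object MustNot extends Occur

sealed trait Q
case class TermQ(value: String) extends Q
case object MatchAll extends Q
case class BoolQ(clauses: List[(Q, Occur)]) extends Q

object BooleanModel {
  // Each document is modeled as the set of values stored in field "k".
  val docs: Map[Int, Set[String]] = Map(1 -> Set("v1", "v2"), 2 -> Set("v1", "v3"))

  def matches(q: Q): Set[Int] = q match {
    case TermQ(v) => docs.collect { case (id, values) if values(v) => id }.toSet
    case MatchAll => docs.keySet
    case BoolQ(clauses) =>
      val must    = clauses.collect { case (c, Must) => matches(c) }
      val should  = clauses.collect { case (c, Should) => matches(c) }
      val mustNot = clauses.collect { case (c, MustNot) => matches(c) }
      // Candidates come only from positive clauses; with none, the set is empty.
      val positive: Set[Int] =
        if (must.nonEmpty) must.reduce(_ intersect _)
        else should.foldLeft(Set.empty[Int])(_ union _)
      // Negative clauses can only remove candidates, never add them.
      positive diff mustNot.foldLeft(Set.empty[Int])(_ union _)
  }

  // +k:v1 -k:v2
  val inner: Q = BoolQ(List((TermQ("v1"), Must), (TermQ("v2"), MustNot)))
  // -(+k:v1 -k:v2): the outer query has no positive clause
  val original: Q = BoolQ(List((inner, MustNot)))
  // Same query plus a match-all SHOULD clause
  val fixed: Q = BoolQ(List((inner, MustNot), (MatchAll, Should)))

  def main(args: Array[String]): Unit = {
    println("# original: " + matches(original).size) // prints "# original: 0"
    println("# fixed: " + matches(fixed).size)       // prints "# fixed: 1"
  }
}
```

In this model the original query has nothing to subtract from, so it matches no documents, while the version with the match-all clause starts from both documents and removes the one matched by the inner query. That lines up with the 0 and 1 hits printed in the question.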