Java: How to index and search numbers in Lucene 4.1
In my 3.6 code, I added numeric fields to the index like this:
public void addNumericField(IndexField field, Integer value) {
    addField(field, NumericUtils.intToPrefixCoded(value));
}
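But in 4.1 you have to pass it a BytesRef argument, and it is entirely unclear what you are supposed to do with the value afterwards. So I changed the method to use an IntField instead (work in progress), which looks a lot tidier.
In 3.6 I had also overridden the query parser so that numeric range searches would work: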
package org.musicbrainz.search.servlet;

import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TermRangeQuery;
import org.apache.lucene.util.NumericUtils;
import org.musicbrainz.search.LuceneVersion;
import org.musicbrainz.search.index.LabelIndexField;
import org.musicbrainz.search.servlet.mmd1.LabelType;

public class LabelQueryParser extends MultiFieldQueryParser {

    public LabelQueryParser(java.lang.String[] strings, org.apache.lucene.analysis.Analyzer analyzer) {
        super(LuceneVersion.LUCENE_VERSION, strings, analyzer);
    }

    protected Query newTermQuery(Term term) {
        // Compare field names with equals() rather than ==
        if (term.field().equals(LabelIndexField.CODE.getName())) {
            try {
                int number = Integer.parseInt(term.text());
                TermQuery tq = new TermQuery(new Term(term.field(), NumericUtils.intToPrefixCoded(number)));
                return tq;
            } catch (NumberFormatException nfe) {
                // If not provided numeric argument just leave as is,
                // won't give matches
                return super.newTermQuery(term);
            }
        } else {
            return super.newTermQuery(term);
        }
    }

    /**
     * Convert Numeric Fields
     *
     * @param field
     * @param part1
     * @param part2
     * @param inclusive
     * @return
     */
    @Override
    public Query newRangeQuery(String field,
                               String part1,
                               String part2,
                               boolean inclusive) {
        if (field.equals(LabelIndexField.CODE.getName())) {
            part1 = NumericUtils.intToPrefixCoded(Integer.parseInt(part1));
            part2 = NumericUtils.intToPrefixCoded(Integer.parseInt(part2));
        }
        TermRangeQuery query = (TermRangeQuery)
                super.newRangeQuery(field, part1, part2, inclusive);
        return query;
    }
}
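As background on why the 3.6 code encodes the number instead of indexing its decimal text: a TermRangeQuery compares terms as strings, and plain decimal strings do not sort numerically ("10" sorts before "2"). The sketch below illustrates the idea with a deliberately simplified fixed-width hex encoding. It is not the byte format that NumericUtils.intToPrefixCoded actually produces, and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; a simplified stand-in for a sortable int encoding.
class SortableIntDemo {

    // Flip the sign bit so negatives sort before positives, then write
    // the result as 8 zero-padded hex digits so every term has equal length.
    static String encode(int value) {
        return String.format("%08x", value ^ 0x80000000);
    }

    // Reverse of encode(): parse the hex digits and flip the sign bit back.
    static int decode(String encoded) {
        return Integer.parseUnsignedInt(encoded, 16) ^ 0x80000000;
    }

    public static void main(String[] args) {
        int[] values = {10, 2, -5, 100};

        List<String> plain = new ArrayList<>();
        List<String> encoded = new ArrayList<>();
        for (int v : values) {
            plain.add(Integer.toString(v));
            encoded.add(encode(v));
        }

        // Sorting the strings mimics the order in which terms are compared.
        Collections.sort(plain);
        Collections.sort(encoded);

        System.out.println("plain order:   " + plain);
        StringBuilder decodedOrder = new StringBuilder();
        for (String e : encoded) {
            decodedOrder.append(decode(e)).append(' ');
        }
        System.out.println("encoded order: " + decodedOrder.toString().trim());
    }
}
```

With an encoding of this kind, a string range over the encoded terms selects the same documents as the numeric range would, which is what the overridden newRangeQuery above relies on.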
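So I commented out this whole query parser override, thinking I no longer needed it, but unfortunately queries on the IntField no longer work at all.
Reading further, it seems that IntFields are only meant for range queries, so I don't know how to do an exact-match query, or whether NumericRangeQuery is compatible with the classic query parser I am using.
So I went back to trying to add my numeric value as an encoded string: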
public void addNumericField(IndexField field, Integer value) {
    FieldType fieldType = new FieldType();
    fieldType.setStored(true);
    fieldType.setIndexed(true);
    BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
    NumericUtils.intToPrefixCoded(value, 0, bytes);
    doc.add(new Field(field.getName(), bytes, fieldType));
}
But at runtime I now get this error:
java.lang.IllegalArgumentException: Fields with BytesRef values cannot be indexed
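But I do need the field to be indexed, so please tell me how to index numeric fields the way 3.6 did so that I can search them.

So I have got this working; whether it is the best way of doing it, I don't know: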
FieldType fieldType = new FieldType();
fieldType.setStored(true);
fieldType.setIndexed(true);
BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
NumericUtils.intToPrefixCoded(value, 0, bytes);
doc.add(new Field(field.getName(),bytes.utf8ToString(), fieldType));
protected Query newTermQuery(Term term)
{
    if (term.field().equals(LabelIndexField.CODE.getName()))
    {
        try
        {
            int number = Integer.parseInt(term.text());
            BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
            NumericUtils.intToPrefixCoded(number, 0, bytes);
            TermQuery tq = new TermQuery(new Term(term.field(), bytes.utf8ToString()));
            return tq;
        }
        catch (NumberFormatException nfe)
        {
            //If not provided numeric argument just leave as is, won't give matches
            return super.newTermQuery(term);
        }
    }
    else
    {
        return super.newTermQuery(term);
    }
}

public Query newRangeQuery(String field,
                           String part1,
                           String part2,
                           boolean startInclusive,
                           boolean endInclusive)
{
    if (field.equals(LabelIndexField.CODE.getName()))
    {
        BytesRef bytes1 = new BytesRef(NumericUtils.BUF_SIZE_INT);
        BytesRef bytes2 = new BytesRef(NumericUtils.BUF_SIZE_INT);
        NumericUtils.intToPrefixCoded(Integer.parseInt(part1), 0, bytes1);
        NumericUtils.intToPrefixCoded(Integer.parseInt(part2), 0, bytes2);
        part1 = bytes1.utf8ToString();
        part2 = bytes2.utf8ToString();
    }
    TermRangeQuery query = (TermRangeQuery)
            super.newRangeQuery(field, part1, part2, startInclusive, endInclusive);
    return query;
}
The stored value can then be decoded back to an int with:
NumericUtils.prefixCodedToInt(new BytesRef(code))
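Just use the appropriate field type, for example IntField, LongField, and so on.
For information on querying these fields, see this tip on how to do it with Lucene 4.7.

When indexing, I simply do the following: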
document.add(new IntField("int_field", int_value, Field.Store.YES));
And for querying:
public class MyQueryParser extends QueryParser {

    public MyQueryParser(Version matchVersion, String field, Analyzer analyzer) {
        super(matchVersion, field, analyzer);
    }

    @Override
    protected Query getRangeQuery(String field, String part1, String part2, boolean startInclusive, boolean endInclusive) throws ParseException {
        if ("int_field".equals(field)) {
            return NumericRangeQuery.newIntRange(field, Integer.parseInt(part1), Integer.parseInt(part2), startInclusive, endInclusive);
        } else {
            return super.getRangeQuery(field, part1, part2, startInclusive, endInclusive);
        }
    }

    @Override
    protected Query newTermQuery(Term term)
    {
        if ("int_field".equals(term.field())) {
            try {
                int number = Integer.parseInt(term.text());
                BytesRef bytes = new BytesRef(NumericUtils.BUF_SIZE_INT);
                NumericUtils.intToPrefixCoded(number, 0, bytes);
                TermQuery tq = new TermQuery(new Term(term.field(), bytes.utf8ToString()));
                return tq;
            } catch (NumberFormatException nfe) {
                //If not provided numeric argument just leave as is, won't give matches
                return super.newTermQuery(term);
            }
        } else {
            return super.newTermQuery(term);
        }
    }
}
With this in place, queries like
int_field: 1
int_field: [1 TO 5]
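work as expected.

But if you use IntField, how do you extend QueryParser to search such fields?

Return the appropriate NumericRangeQuery, built from the values parsed as integers. So, if you like, you can return a NumericRangeQuery from the newTermQuery method instead of a TermQuery.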
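The approach described here comes down to turning the parsed text into integer bounds and handing them to NumericRangeQuery.newIntRange: a single value n becomes the closed range [n, n]. Below is a minimal sketch of just that bound-handling logic in plain Java, with no Lucene dependency. IntBounds, forTerm, forRange and matches are hypothetical names for illustration, and matches only stands in for what the index would do. It assumes the classic query parser passes null for an open end such as [1 TO *], which is worth verifying against your Lucene version; NumericRangeQuery itself accepts null as a bound for a half-open range.

```java
// Hypothetical helper: models the bounds one would pass to
// NumericRangeQuery.newIntRange(field, min, max, minInclusive, maxInclusive).
class IntBounds {
    final Integer min;          // null means no lower bound
    final Integer max;          // null means no upper bound
    final boolean minInclusive;
    final boolean maxInclusive;

    IntBounds(Integer min, Integer max, boolean minInclusive, boolean maxInclusive) {
        this.min = min;
        this.max = max;
        this.minInclusive = minInclusive;
        this.maxInclusive = maxInclusive;
    }

    // Exact match such as int_field:7, expressed as the closed range [7, 7].
    // Throws NumberFormatException if the text is not an integer.
    static IntBounds forTerm(String text) {
        int n = Integer.parseInt(text.trim());
        return new IntBounds(n, n, true, true);
    }

    // Range such as int_field:[1 TO 5]; a null part is treated as an open end
    // (assumption: this is how the parser signals "*").
    static IntBounds forRange(String part1, String part2,
                              boolean startInclusive, boolean endInclusive) {
        return new IntBounds(parse(part1), parse(part2), startInclusive, endInclusive);
    }

    private static Integer parse(String part) {
        return part == null ? null : Integer.valueOf(part.trim());
    }

    // Stand-in for the index lookup: does a value fall inside the bounds?
    boolean matches(int value) {
        if (min != null && (minInclusive ? value < min : value <= min)) {
            return false;
        }
        if (max != null && (maxInclusive ? value > max : value >= max)) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(forTerm("1").matches(1));
        System.out.println(forRange("1", "5", true, true).matches(6));
    }
}
```

Inside an overridden newTermQuery, the NumberFormatException raised for non-numeric text could be caught and the call delegated to super.newTermQuery(term), as the parser shown earlier does.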