How to add the last two digits of a year to a Hibernate Search/Lucene index


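In my database I store years in full form, for example 2012, 2013, 2014, and that is also how they are stored in my index. I also want to store the last two digits in the index, for example 12, 13, 14. Basically, I want users to be able to run a keyword search on both 2012 and 12.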

My main search analyzer looks like this:

@AnalyzerDefs({
    @AnalyzerDef(name = "searchtokenanalyzer",
            // KeywordTokenizer emits the whole input as a single token
            tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "([^a-zA-Z0-9\\-])"),
                    @Parameter(name = "replacement", value = ""),
                    @Parameter(name = "replace", value = "all")}),
                @TokenFilterDef(factory = StopFilterFactory.class),
                @TokenFilterDef(factory = TrimFilterFactory.class)
            }),
I have a second analyzer to handle the year abbreviation, which looks like this:

    @AnalyzerDef(name = "yearanalyzer",
            // KeywordTokenizer emits the whole input as a single token
            tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
            filters = {
                // Strip the first two characters, so "2014" becomes "14"
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "^.{2}"),
                    @Parameter(name = "replacement", value = ""),
                    @Parameter(name = "replace", value = "all")}),
                @TokenFilterDef(factory = StopFilterFactory.class),
                @TokenFilterDef(factory = TrimFilterFactory.class)
            })
})
On my entity field I have the following:

@Entity
@Indexed
public class YearLookup {
    // The same property is indexed twice: once in full ("name") and once abbreviated ("abbr")
    @Fields({
            @Field(name = "name", store = Store.NO, index = Index.YES,
                    analyze = Analyze.YES, analyzer = @Analyzer(definition = "searchtokenanalyzer")),
            @Field(name = "abbr", store = Store.NO, index = Index.YES,
                    analyze = Analyze.YES, analyzer = @Analyzer(definition = "yearanalyzer"))
    })
    private String name;
}
So far everything in the index is correct, and I can see:

name 2012,2013,2014
abbr 12,13,14
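Now, when I search YearLookup.class with the following code, the abbreviated year in the query has two digits stripped again, which produces an empty value, while the name stays unchanged.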

public interface SearchParam {
    public static final String[] SEARCH_FIELDS = new String[]{"yearLookup.name", "yearLookup.abbr"};
}

String searchString = "14";

QueryBuilder queryBuilder = fullTextSession.getSearchFactory().buildQueryBuilder().forEntity(YearLookup.class).get();

TermMatchingContext onWildCardFields = queryBuilder.keyword().wildcard().onField(SearchParam.SEARCH_FIELDS[0]);
TermMatchingContext onFuzzyFields = queryBuilder.keyword().fuzzy().withThreshold(0.7f)
        .withPrefixLength(1).onField(SearchParam.SEARCH_FIELDS[0]);

// Iterate over all the remaining search fields stored in the "VehicleListing" index
for (int i = 1; i < SearchParam.SEARCH_FIELDS.length; i++) {
    onWildCardFields.andField(SearchParam.SEARCH_FIELDS[i]);
    onFuzzyFields.andField(SearchParam.SEARCH_FIELDS[i]);
}

String[] tokens = searchString.toLowerCase().split("\\s");

// org.apache.lucene.search.Query, declared here so the snippet is self-contained
Query luceneQuery = null;
for (String token : tokens) {
    // Each iteration overwrites the previous query, so only the last token is searched
    luceneQuery = queryBuilder.bool()
            .should(onWildCardFields.matching(token + "*").createQuery())
            .should(onFuzzyFields.matching(token).createQuery())
            .createQuery();
}

FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery(luceneQuery, YearLookup.class);

Integer results = fullTextQuery.getResultSize();
Now, when I run a test case against this, I get the following exception:

HSEARCH000146: The query string '14' applied on field 'yearLookup.abbr' has no meaningful tokens to be matched. Validate the query input against the Analyzer applied on this field.
org.hibernate.search.errors.EmptyQueryException
    at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:111)
    at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:86)
    at com.domain.auto.services.search.impl.SearchManagerImpl.doSearch(SearchManagerImpl.java:146)
    at $SearchManager_138fdc525111b303.doSearch(Unknown Source)
    at $SearchManager_138fdc525111b2f3.doSearch(Unknown Source)
    at com.domain.auto.services.search.impl.SearchServiceImplTest.testYearSearch(SearchServiceImplTest.java:92)

Does anyone have any ideas?

Solution

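The same analyzer is applied to the query string, so the pattern ^.{2} also strips the two characters of a query such as 14 and leaves an empty token. Change the pattern so that it only rewrites four-digit input and leaves anything else untouched:

@AnalyzerDef(name = "yearanalyzer",
        // KeywordTokenizer emits the whole input as a single token
        tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
        filters = {
            // Only a four-digit token matches; it is replaced by its last two digits
            @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                @Parameter(name = "pattern", value = "^\\d{2}(\\d{2})$"),
                @Parameter(name = "replacement", value = "$1"),
                @Parameter(name = "replace", value = "all")})
        })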
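
To see the difference between the two patterns outside of Lucene, here is a minimal sketch using plain java.util.regex. It assumes the filter behaves like a regex replace-all on the single keyword token; the class name YearPatternDemo and its two helper methods are made up for illustration and are not part of Hibernate Search or Lucene.

```java
import java.util.regex.Pattern;

public class YearPatternDemo {
    // Pattern from the question's "yearanalyzer": drops the first two characters of any input
    static final Pattern STRIP_FIRST_TWO = Pattern.compile("^.{2}");
    // Pattern from the solution: matches only a four-digit input and captures its last two digits
    static final Pattern LAST_TWO_OF_FOUR = Pattern.compile("^\\d{2}(\\d{2})$");

    // Hypothetical helper: applies the original pattern with an empty replacement
    static String stripFirstTwo(String token) {
        return STRIP_FIRST_TWO.matcher(token).replaceAll("");
    }

    // Hypothetical helper: applies the corrected pattern with "$1" as the replacement
    static String lastTwoOfFour(String token) {
        return LAST_TWO_OF_FOUR.matcher(token).replaceAll("$1");
    }

    public static void main(String[] args) {
        // "2014" stands for the indexed value, "14" for the user's query
        for (String token : new String[] {"2014", "14"}) {
            System.out.println(token + " -> [" + stripFirstTwo(token) + "] vs [" + lastTwoOfFour(token) + "]");
        }
    }
}
```

With the original pattern the query 14 collapses to an empty string, which is consistent with the EmptyQueryException; with the corrected pattern both 2014 and 14 end up as 14.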

Create a field bridge that handles the string in both cases, like this:

 @FieldBridge(impl = YearFieldBridge.class)
 private String name;
And create a bridge class similar to this:

import java.io.Serializable;
import org.hibernate.search.bridge.StringBridge;

public class YearFieldBridge implements StringBridge, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public String objectToString(Object value) {
        if (value instanceof String) {
            String strVal = ((String) value).toUpperCase();
            // Expand a two-digit year to its full form, assuming the 2000s
            if (strVal.length() == 2) {
                return "20" + strVal;
            }
            return strVal;
        }
        // null and non-String values are not indexed
        return null;
    }
}
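
The bridge's normalisation can be tried without Hibernate Search on the classpath. The sketch below copies the logic of objectToString into a standalone class; YearNormalizer and normalize are hypothetical names used only for this illustration, and the "20" prefix carries the same assumption as the bridge, namely that every abbreviated year belongs to the 2000s.

```java
public class YearNormalizer {
    // Hypothetical stand-in for YearFieldBridge.objectToString, with no library dependency
    static String normalize(Object value) {
        if (value instanceof String) {
            String year = ((String) value).toUpperCase();
            // A two-digit year is expanded to four digits; anything else is returned as-is
            return year.length() == 2 ? "20" + year : year;
        }
        // null and non-String values yield null, as in the bridge
        return null;
    }

    public static void main(String[] args) {
        System.out.println(normalize("14"));   // abbreviated year
        System.out.println(normalize("2014")); // full year
    }
}
```

If the bridge is also applied to the search term, both 14 and 2014 would resolve to 2014, so a single full-year field might be enough; whether that holds depends on how the query is built.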