How do I do case-insensitive sorting with Hibernate Search (Lucene)?
Tags: java, hibernate, lucene, hibernate-search

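I can get results with the code below, but they are not sorted correctly. It shows the lowercase values first and then the uppercase ones.

Actual results:

upper
test
UPPER
Test

Expected results:

upper
UPPER
Test
test

The values can start with either an uppercase letter (T) or a lowercase letter (t).

The code is below for reference.

Prada entity class: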

@Entity
@Table(name = "Prada")
@XmlRootElement
@Indexed
@AnalyzerDef(name="customanalyzer", tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class), 
    filters = { 
        @TokenFilterDef(factory=ISOLatin1AccentFilterFactory.class),
        @TokenFilterDef(factory=LowerCaseFilterFactory.class)})
public class Prada implements Serializable {
 private static final long serialVersionUID = 1L;
@Id
@Basic(optional = false)
@Column(name = "ID")
private Long id;

@Fields({ @Field(index = Index.YES, store = Store.NO), @Field(name = "PradaName_for_sort", index = Index.YES, analyzer = @Analyzer(definition = "customanalyzer")) })
@Column(name = "NAME", length = 100)
private String name;

public Prada () {
}

public Prada (Long id) {
    this.id = id;
}

public Long getId() {
    return id;
}

public void setId(Long id) {
    this.id = id;
}



public String getName() {
    return name;
}

public void setName(String name) {
    this.name = name;
}


@Override
public String toString() {
    return "com.Prac.Prada[ id=" + id + " ]";
}

}
I found this @AnalyzerDef solution somewhere, but it does not work for me. Can anyone suggest a solution?

Main code:

  FullTextEntityManager ftem = Search.getFullTextEntityManager(factory.createEntityManager());
  QueryBuilder qb = ftem.getSearchFactory().buildQueryBuilder().forEntity( Prada.class ).get();
  org.apache.lucene.search.Query query = qb.all().getQuery(); 
  FullTextQuery fullTextQuery = ftem.createFullTextQuery(query, Prada.class);
  fullTextQuery.setSort(new Sort(new SortField("PradaName_for_sort", SortField.STRING, true)));
  fullTextQuery.setFirstResult(0).setMaxResults(150);
  int size = fullTextQuery.getResultSize();
  List<Prada> result = fullTextQuery.getResultList();
  for (Prada user : result) {
    logger.info("Prada Name:" + user.getName());
  }

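Never sort on a field whose analyzer uses a tokenizer that splits the value into several tokens. Use the keyword tokenizer (KeywordTokenizerFactory) so the whole value stays a single token.

This is the analyzer we used for sorting at my previous company: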

    @AnalyzerDef(name = "TEXT_SORT",
        tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
        filters = {
                @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "('-&\\.,\\(\\))"),
                    @Parameter(name = "replacement", value = " "),
                    @Parameter(name = "replace", value = "all")
                }),
                @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
                    @Parameter(name = "pattern", value = "([^0-9\\p{L} ])"),
                    @Parameter(name = "replacement", value = ""),
                    @Parameter(name = "replace", value = "all")
                }),
                @TokenFilterDef(factory = TrimFilterFactory.class)
        }
    )

This is written for recent versions of Hibernate Search, so you will need to adapt it. Notably, you need an s/ASCIIFoldingFilterFactory/ISOLatin1AccentFilterFactory/, but I am not sure PatternReplaceFilterFactory already exists in 3.6.2.
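
The effect of a keyword tokenizer followed by a lowercase filter can be illustrated without Lucene. The sketch below is plain JDK code, not Hibernate Search API: the class SortKeyDemo and the helpers sortKey and sortDescending are hypothetical names introduced only for this illustration. It treats each value as a single token, lowercases it, and sorts descending on that key, mirroring the reverse flag in the question's SortField.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class SortKeyDemo {

    // Hypothetical helper: keeps the whole value as one token (as a keyword
    // tokenizer would) and lowercases it (as a lowercase filter would).
    static String sortKey(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }

    // Sorts descending on the normalized key. List.sort is stable, so values
    // whose keys are equal keep their input order.
    static List<String> sortDescending(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Comparator<String> byKey = Comparator.comparing(SortKeyDemo::sortKey);
        sorted.sort(byKey.reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Same four values as in the question.
        System.out.println(sortDescending(Arrays.asList("upper", "test", "UPPER", "Test")));
    }
}
```

Because "upper" and "UPPER" share one key, they end up next to each other instead of being separated by case. The relative order inside such a group depends only on input order here; in a real index it is not something the sort field alone would determine.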

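Comments:

Where is the code for your customanalyzer?

I don't have separate code for it; I used the built-in filters via @AnalyzerDef.

Did you try defining a custom analyzer as described above?

The sorting works now, but it does not sort special characters from Norwegian or other languages correctly. Do you know how to fix that?

So you mean it now returns the results in the order you want, i.e. upper, UPPER, Test, test, but it does not sort special characters? Right?

Thanks. How do I replace the Norwegian characters Æ, Ø and Å in the example above? Here O would stand for Ø and A for Å. But replacement may not be a good criterion, since we still need accurate results in the list after replacing?

That is exactly what ASCIIFoldingFilterFactory (and ISOLatin1AccentFilterFactory in Lucene 3.6.2) does.

But it does not work as expected: Ø ends up between A and B, when it should be between M and P.

Can you share your updated code (just the mapping part)? Did you replace StandardTokenizerFactory with KeywordTokenizerFactory as described above?

Shared in the description. I am using the same snippet you gave. It shows the results correctly except for the Norwegian characters.
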
@AnalyzerDef(name = "customanalyzer",
tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
filters = {
    @TokenFilterDef(factory = ASCIIFoldingFilterFactory.class),
    @TokenFilterDef(factory = LowerCaseFilterFactory.class),
    @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
        @Parameter(name = "pattern", value = "('-&\\.,\\(\\))"),
        @Parameter(name = "replacement", value = " "),
        @Parameter(name = "replace", value = "all")
    }),
    @TokenFilterDef(factory = PatternReplaceFilterFactory.class, params = {
        @Parameter(name = "pattern", value = "([^0-9\\p{L} ])"),
        @Parameter(name = "replacement", value = ""),
        @Parameter(name = "replace", value = "all")
    }),
    @TokenFilterDef(factory = TrimFilterFactory.class)
}
)
public class Prada implements Serializable {

    @Fields({ @Field(index = Index.YES, store = Store.YES), @Field(name = "PradaName_for_sort", index = Index.YES, analyzer = @Analyzer(definition = "customanalyzer")) })
    @Column(name = "NAME", length = 100)
    private String name;

    // Remaining members are not shown in the updated snippet.
}
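
On the Norwegian-character question from the comments: folding replaces Æ, Ø and Å with plain ASCII letters before the key is compared, so a value starting with Ø is ordered as if it started with O. The sketch below is plain JDK code with a hand-written replacement table for just those three letters; it is a stand-in for illustration, not the Lucene filter, and FoldingDemo, foldedKey and sortByFoldedKey are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class FoldingDemo {

    // Hand-written stand-in for an ASCII-folding step, limited to the three
    // Norwegian letters from the discussion. Unicode escapes are used so the
    // source compiles regardless of file encoding:
    // U+00C6/U+00E6 = AE ligature, U+00D8/U+00F8 = O with stroke,
    // U+00C5/U+00E5 = A with ring.
    static String foldedKey(String value) {
        return value
                .replace("\u00C6", "AE").replace("\u00E6", "ae")
                .replace("\u00D8", "O").replace("\u00F8", "o")
                .replace("\u00C5", "A").replace("\u00E5", "a")
                .toLowerCase(Locale.ROOT);
    }

    // Sorts ascending on the folded, lowercased key.
    static List<String> sortByFoldedKey(List<String> values) {
        List<String> sorted = new ArrayList<>(values);
        Comparator<String> byKey = Comparator.comparing(FoldingDemo::foldedKey);
        sorted.sort(byKey);
        return sorted;
    }

    public static void main(String[] args) {
        // Per, Oern (with O-stroke), Nils, Aase (with A-ring), Bo
        System.out.println(sortByFoldedKey(
                Arrays.asList("Per", "\u00D8rn", "Nils", "\u00C5se", "Bo")));
    }
}
```

With these inputs the folded order is Åse, Bo, Nils, Ørn, Per: Ørn sits between Nils and Per, and Åse sorts with the A values. The Norwegian alphabet itself places Æ, Ø and Å after Z, so if that order is required, folding is the wrong tool and some locale-aware collation would presumably be needed instead.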