Java Lucene: generateWordParts vs splitOnCaseChange

I am investigating WordDelimiterFilterFactory.

I am confused about the generateWordParts and splitOnCaseChange parameters.

From the Javadoc, for generateWordParts:

/**
   * Causes parts of words to be generated:
   * <p>
   * "PowerShot" =&gt; "Power" "Shot"
   */
  public static final int GENERATE_WORD_PARTS = 1;
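Can you explain the difference with an example?

P.S. I also don't understand what SUBWORD_DELIM is used for.

WordDelimiterFilter has been superseded by WordDelimiterGraphFilter, which handles phrase queries properly.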

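generateWordParts considers more than just case differences. For example, foo-bar is split into foo and bar. Splitting on case change does nothing here and, on its own, would leave a single foo-bar token.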
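
To make the difference concrete, here is a minimal sketch in plain Java that models the two options. It is not Lucene's implementation: the class name WordPartsSketch and the method wordParts are hypothetical, every non-alphanumeric character is assumed to be a delimiter, and options such as splitOnNumerics, catenateWords, and preserveOriginal are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;

public class WordPartsSketch {

    /**
     * Simplified model (an assumption, not Lucene code): returns the parts that
     * generateWordParts would emit, optionally also breaking at a
     * lowercase-to-uppercase transition when splitOnCaseChange is enabled.
     */
    public static List<String> wordParts(String token, boolean splitOnCaseChange) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                // Delimiter: close the current part and drop the delimiter itself.
                flush(current, parts);
                continue;
            }
            boolean caseChange = current.length() > 0
                    && Character.isLowerCase(current.charAt(current.length() - 1))
                    && Character.isUpperCase(c);
            if (splitOnCaseChange && caseChange) {
                // Lowercase followed by uppercase: start a new part.
                flush(current, parts);
            }
            current.append(c);
        }
        flush(current, parts);
        return parts;
    }

    private static void flush(StringBuilder current, List<String> parts) {
        if (current.length() > 0) {
            parts.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(wordParts("foo-bar", false));   // [foo, bar]
        System.out.println(wordParts("foo-bar", true));    // [foo, bar]
        System.out.println(wordParts("PowerShot", false)); // [PowerShot]
        System.out.println(wordParts("PowerShot", true));  // [Power, Shot]
    }
}
```

In this model the delimiter split happens regardless of splitOnCaseChange, while the break inside PowerShot only appears once splitOnCaseChange is turned on.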

SUBWORD_DELIM refers to the types parameter, where you can supply a file that defines which characters should be treated as splitting a token into subwords:

types
(optional) The pathname of a file that contains character => type mappings, which enable customization of this filter’s splitting behavior. Recognized character types: LOWER, UPPER, ALPHA, DIGIT, ALPHANUM, and SUBWORD_DELIM.

I assume you could declare characters such as "|" or "." as SUBWORD_DELIM, so that a word containing either of them is split into two tokens.
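
The effect of a character => type mapping can be sketched in plain Java as follows. This is only an illustrative model: TypeTableSketch, CharType, and split are hypothetical names, only two of the documented types are modelled, and the fallback rule (letters and digits are ALPHANUM, everything else is SUBWORD_DELIM) is an assumption of this sketch rather than something taken from the filter's source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TypeTableSketch {

    // Only two of the documented character types are modelled here.
    public enum CharType { ALPHANUM, SUBWORD_DELIM }

    /**
     * Splits a token at every character classified as SUBWORD_DELIM.
     * The overrides map plays the role of the "types" file: it replaces the
     * assumed default classification for individual characters.
     */
    public static List<String> split(String token, Map<Character, CharType> overrides) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : token.toCharArray()) {
            CharType fallback = Character.isLetterOrDigit(c)
                    ? CharType.ALPHANUM
                    : CharType.SUBWORD_DELIM;
            CharType type = overrides.getOrDefault(c, fallback);
            if (type == CharType.SUBWORD_DELIM) {
                // Close the current subword; the delimiter is not emitted.
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        // No overrides: both '|' and '.' act as delimiters in this model.
        System.out.println(split("a|b.c", Map.of()));                            // [a, b, c]
        // Reclassify '.' so that it stays inside the subword.
        System.out.println(split("a|b.c", Map.of('.', CharType.ALPHANUM)));      // [a, b.c]
        // Declare an ordinary letter as a delimiter.
        System.out.println(split("fooxbar", Map.of('x', CharType.SUBWORD_DELIM))); // [foo, bar]
    }
}
```

Under these assumptions a mapping is mostly useful for changing the default treatment of a character, either keeping a punctuation character inside a token or turning an otherwise ordinary character into a delimiter.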

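Can you provide the class name for the word delimiter graph filter? I can't find it.

Solr's factories are named in the usual way: solr.WordDelimiterGraphFilterFactory.

It looks like WordDelimiterGraphFilterFactory is not available in Hibernate Search.

It behaves essentially the same as WordDelimiterFilter, but with better (that is, actual) support for phrase queries.

Do you really think the Javadoc is clear?

There is more documentation than the comments on the constants in the Java file, though. From the documentation: generateWordParts: (integer, default 1) If non-zero, splits words at delimiters. For example: "CamelCase", "hot-spot" -> "Camel", "Case", "hot", "spot".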