
Effect of filters on search results in Solr


When I query for elegant in Solr, I also get results for elegance.

I use these filters for index analysis:

WhitespaceTokenizerFactory
StopFilterFactory
WordDelimiterFilterFactory
LowerCaseFilterFactory
SynonymFilterFactory
EnglishPorterFilterFactory
RemoveDuplicatesTokenFilterFactory
ReversedWildcardFilterFactory

For query analysis:

WhitespaceTokenizerFactory
SynonymFilterFactory
StopFilterFactory
WordDelimiterFilterFactory
LowerCaseFilterFactory
EnglishPorterFilterFactory
RemoveDuplicatesTokenFilterFactory 

I want to know which filter is affecting my search results.

EnglishPorterFilterFactory

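That is the short answer.

More information:

English Porter refers to the English Porter stemmer, a stemming algorithm. According to the stemmer, elegant and elegance share the same stem. A stemmer is a heuristic word-root builder.

You can verify this online. Basically, you will see that "eleg ant" and "eleg ance" both stem to the same root: eleg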
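
To make the idea concrete, here is a minimal sketch of suffix stripping. It is a toy and not the Porter or Snowball algorithm: the class name `ToySuffixStemmer`, the two-entry suffix list and the minimum stem length are all assumptions chosen for illustration. It only shows why two different surface forms can collapse into one indexed term.

```java
import java.util.Locale;

// Toy suffix stripper for illustration only; NOT the Porter/Snowball algorithm.
// The suffix list and the minimum stem length are assumptions.
class ToySuffixStemmer {
    private static final String[] SUFFIXES = {"ance", "ant"};
    private static final int MIN_STEM_LENGTH = 3;

    static String stem(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            int stemLength = lower.length() - suffix.length();
            // Strip the suffix only if a reasonably long stem remains.
            if (lower.endsWith(suffix) && stemLength >= MIN_STEM_LENGTH) {
                return lower.substring(0, stemLength);
            }
        }
        return lower; // no known suffix: leave the word as it is
    }

    public static void main(String[] args) {
        System.out.println(stem("elegant"));  // eleg
        System.out.println(stem("elegance")); // eleg
    }
}
```

Because both words are reduced to the same term at index time and at query time, a search for one matches documents containing the other.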

From the Solr source:

        public void inform(ResourceLoader loader) {
            String wordFiles = args.get(PROTECTED_TOKENS);
            if (wordFiles != null) {
                try {
                    // This is where the protwords file comes in:

                    File protectedWordFiles = new File(wordFiles);
                    if (protectedWordFiles.exists()) {
                        List<String> wlist = loader.getLines(wordFiles);
                        //This cast is safe in Lucene
                        protectedWords = new CharArraySet(wlist, false);//No need to go through StopFilter as before, since it just uses a List internally
                    } else {
                        List<String> files = StrUtils
                                .splitFileNames(wordFiles);
                        for (String file : files) {
                            List<String> wlist = loader.getLines(file
                                    .trim());
                            if (protectedWords == null)
                                protectedWords = new CharArraySet(wlist,
                                        false);
                            else
                                protectedWords.addAll(wlist);
                        }
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }

        public EnglishPorterFilter create(TokenStream input) {
            return new EnglishPorterFilter(input, protectedWords);
        }

    }

    /**
     * English Porter2 filter that doesn't use reflection to
     * adapt lucene to the snowball stemmer code.
     */
    @Deprecated
    class EnglishPorterFilter extends SnowballPorterFilter {
        public EnglishPorterFilter(TokenStream source,
                CharArraySet protWords) {
            super(source, new org.tartarus.snowball.ext.EnglishStemmer(),
                    protWords);
        }
    }

Comments:

@fyr: Yes, I used the Solr admin page to see the effect, but EnglishPorterFilter uses protwords.txt, in which I have not included anything. So how does it do this? What is the purpose of protwords.txt?

No, it only uses protwords for the stems you want to fix. It is heuristic, so it makes mistakes. The English Porter algorithm uses the Snowball library.

I am using it as: [...] So what is protwords.txt here?

See my edit. Protected words are words that are not stemmed.
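
The role of the protected-word list can be sketched as follows. This is a simplified illustration under stated assumptions, not Solr or Lucene code: `ProtectedWordStemmer` and `standInStem` are hypothetical names, and the stand-in stemmer only strips two suffixes. The point is that an empty list protects nothing, so every token is still stemmed, while a listed word is passed through unchanged.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch, not a Solr or Lucene class: tokens found in the
// protected set are passed through untouched, all others are stemmed.
class ProtectedWordStemmer {
    private final Set<String> protectedWords;
    private final UnaryOperator<String> stemmer;

    ProtectedWordStemmer(Set<String> protectedWords, UnaryOperator<String> stemmer) {
        this.protectedWords = new HashSet<>(protectedWords); // defensive copy
        this.stemmer = stemmer;
    }

    String process(String token) {
        // Exact, case-sensitive lookup in the protected set.
        if (protectedWords.contains(token)) {
            return token;
        }
        return stemmer.apply(token);
    }

    // Stand-in for a real stemmer (assumption): strips two suffixes only.
    static String standInStem(String word) {
        if (word.endsWith("ance")) {
            return word.substring(0, word.length() - 4);
        }
        if (word.endsWith("ant")) {
            return word.substring(0, word.length() - 3);
        }
        return word;
    }

    public static void main(String[] args) {
        ProtectedWordStemmer unguarded = new ProtectedWordStemmer(
                Collections.<String>emptySet(), ProtectedWordStemmer::standInStem);
        ProtectedWordStemmer guarded = new ProtectedWordStemmer(
                Collections.singleton("elegance"), ProtectedWordStemmer::standInStem);

        System.out.println(unguarded.process("elegance")); // eleg
        System.out.println(guarded.process("elegance"));   // elegance
        System.out.println(guarded.process("elegant"));    // eleg
    }
}
```

In this sketch, listing elegance as protected keeps it distinct from elegant, which is the kind of fix the protected-word file is meant for.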