Solr FuzzyLookupFactory exactMatch is case sensitive
This may be a duplicate question, but I could not find anything related to it. I have implemented a Solr suggester for a list of cities and areas, using FuzzyLookupFactory. My schema looks like this:
<fieldType name="suggestTypeLc" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" " />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestions</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">searchfield</str>
    <str name="weightField">searchscore</str>
    <str name="suggestAnalyzerFieldType">suggestTypeLc</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <str name="storeDir">autosuggest_dict</str>
  </lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">suggestions</str>
    <str name="suggest.dictionary">results</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
Whereas calling /suggest?suggest.q=Mumbai (starting with a capital "M") gives the exact result in first place:
{
"responseHeader":{
"status":0,
"QTime":16},
"suggest":{
"suggestions":{
"Mumbai":{
"numFound":10,
"suggestions":[{
"term":"Mumbai",
"weight":2248},
{
"term":"Mumbai Domestic Airport",
"weight":11536},
{
"term":"Mumbai Chhatrapati Shivaji Intl Airport",
"weight":11376},
{
"term":"Mumbai Pune Highway",
"weight":2850},
...
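What am I missing? How can I get Mumbai as the first result even when the query is the lowercase "mumbai"? I thought case sensitivity was handled by the "suggestTypeLc" field type I defined.

There is a hidden configuration parameter, exactMatchFirst, which is described as:
If true, the default, exact suggestions are returned first, even if they are prefixes or other strings in the FST have larger weights.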
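According to your configuration, suggestions are sorted by the weight field (in your configuration, searchscore). That is why, when you query as mumbai, all suggestions are sorted by weight.
But because exactMatchFirst=true, Mumbai still appears at the top for the query Mumbai, regardless of the weighting mechanism. This is how exactMatchFirst affects the sorting.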
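As a sketch of where this parameter lives, it can be written out on the suggester definition. exactMatchFirst is a documented FuzzyLookupFactory option and, per the description quoted above, is true by default, so the added line only makes the existing behavior visible. Everything else is copied from the question's configuration.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestions</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">searchfield</str>
    <str name="weightField">searchscore</str>
    <str name="suggestAnalyzerFieldType">suggestTypeLc</str>
    <!-- Already the default; stated explicitly so the ordering rule is visible -->
    <str name="exactMatchFirst">true</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <str name="storeDir">autosuggest_dict</str>
  </lst>
</searchComponent>
```

Setting the value to false would presumably leave ordering to the weights alone, which is the opposite of what the question asks for.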
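Unfortunately, I did not find an option for tuning your suggestions other than getting rid of weightField entirely.
For example, try turning off weighting by field, or try a different lookup implementation.

The weights would be the same, because it is the same dataset I am querying. The only difference is that in the first case I query with "mumbai" and in the second case with "Mumbai". This further suggests that exactMatch is "true" here, but that it does not handle case differences. If I search "oran" versus "Oran", I get no results for "oran", whereas "Oran" returns the exact match in first place with zero weight :(

"If true, the default, exact suggestions are returned first, even if they are prefixes or other strings in the FST have larger weights." This is exactly what I need, but case-insensitive.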
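The answer's suggestion to turn off weighting by field might look like the sketch below. It is the question's suggester with the weightField line removed, on the assumption that DocumentDictionaryFactory treats weightField as optional. Nothing in this thread confirms that the change makes the exact match case-insensitive, so treat it as an experiment rather than a fix.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">suggestions</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">searchfield</str>
    <!-- weightField removed (assumption: it is optional for this dictionary),
         so suggestions are no longer ordered by searchscore -->
    <str name="suggestAnalyzerFieldType">suggestTypeLc</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
    <str name="storeDir">autosuggest_dict</str>
  </lst>
</searchComponent>
```

Because buildOnStartup and buildOnCommit are both false here, the dictionary would need to be rebuilt by hand after the change, for example with a request that passes suggest.build=true.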