Solr MoreLikeThis query matching on newline characters


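I have a Solr MoreLikeThis query that is returning some completely unrelated results. When I look at the debug output for the query, I can see that it is matching on newline characters.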

The query is:

mlt?q=is_lesson_id:49029&start=0&rows=3&fl=*,score&wt=json&fq={!tag=sites}sm_sitename:(FCM OR BCM OR CCM)&mlt.interestingTerms=details&mlt.match.include=false&mlt.match.offset=0&mlt.fl=title,body&mlt.mintf=2&mlt.mindf=1&mlt.minwl=4&mlt.boost=true&mlt.qf=title^1000 body&indent=on&debugQuery=on
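
The query string above contains characters such as spaces, `^`, `{` and `}` that need percent-encoding when the request is sent from code. A minimal Python sketch of building the same request follows. The host, port and core name in `SOLR_MLT_URL` and the function name `build_mlt_url` are assumptions made for illustration; the post only shows the `mlt?...` part.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the host, port and core name are not given in the post.
SOLR_MLT_URL = "http://localhost:8983/solr/collection1/mlt"


def build_mlt_url(lesson_id, base_url=SOLR_MLT_URL):
    """Return the MoreLikeThis request URL with every parameter percent-encoded."""
    params = [
        ("q", f"is_lesson_id:{lesson_id}"),
        ("start", 0),
        ("rows", 3),
        ("fl", "*,score"),
        ("wt", "json"),
        ("fq", "{!tag=sites}sm_sitename:(FCM OR BCM OR CCM)"),
        ("mlt.interestingTerms", "details"),
        ("mlt.match.include", "false"),
        ("mlt.match.offset", 0),
        ("mlt.fl", "title,body"),
        ("mlt.mintf", 2),
        ("mlt.mindf", 1),
        ("mlt.minwl", 4),
        ("mlt.boost", "true"),
        ("mlt.qf", "title^1000 body"),
        ("indent", "on"),
        ("debugQuery", "on"),
    ]
    # urlencode escapes spaces, braces, "^" and ":" so the URL is safe to send.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_mlt_url(49029))
```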

Here is the debug output:

 "interestingTerms":[
    "body:rabbit",1.0,
    "body:bunni",0.8582874,
    "body:easter",0.7999738,
    "body: ",0.5719101,
    "body:ampampnbsp",0.51804715,
    "body:nbsp",0.36014518],
 "debug":{
    "rawquerystring":"is_lesson_id:49029",
    "querystring":"is_lesson_id:49029",
    "parsedquery":"body:rabbit body:bunni^0.8582874 
                   body:easter^0.7999738              
                   body: ^0.5719101 
                   body:ampampnbsp^0.51804715 
                   body:nbsp^0.36014518",
    "parsedquery_toString":"body:rabbit 
                            body:bunni^0.8582874 
                            body:easter^0.7999738 
                            body: ^0.5719101 
                            body:ampampnbsp^0.51804715 
                            body:nbsp^0.36014518",
    "explain":{
"p5zqzz/node/681":"\n0.14956066 = (MATCH) product of:\n  0.44868195 = (MATCH) sum of:\n    0.20911716 = (MATCH) weight(body:bunni^0.8582874 in 327), product of:\n      0.5523649 = queryWeight(body:bunni^0.8582874), product of:\n        0.8582874 = boost\n        6.9227004 = idf(docFreq=116, maxDocs=43690)\n        0.09296464 = queryNorm\n      0.3785852 = (MATCH) fieldWeight(body:bunni in 327), product of:\n        1.0 = tf(termFreq(body:bunni)=1)\n        6.9227004 = idf(docFreq=116, maxDocs=43690)\n        0.0546875 = fieldNorm(field=body, doc=327)\n    0.2395648 = (MATCH) weight(body:easter^0.7999738 in 327), product of:\n      0.4799619 = queryWeight(body:easter^0.7999738), product of:\n        0.7999738 = boost\n        6.453766 = idf(docFreq=186, maxDocs=43690)\n        0.09296464 = queryNorm\n      0.49913296 = (MATCH) fieldWeight(body:easter in 327), product of:\n        1.4142135 = tf(termFreq(body:easter)=2)\n        6.453766 = idf(docFreq=186, maxDocs=43690)\n        0.0546875 = fieldNorm(field=body, doc=327)\n  0.33333334 = coord(2/6)\n",
"p5zqzz/node/621":"\n0.14027193 = (MATCH) product of:\n  0.42081577 = (MATCH) sum of:\n    0.21124022 = (MATCH) weight(body:bunni^0.8582874 in 328), product of:\n      0.5523649 = queryWeight(body:bunni^0.8582874), product of:\n        0.8582874 = boost\n        6.9227004 = idf(docFreq=116, maxDocs=43690)\n        0.09296464 = queryNorm\n      0.38242877 = (MATCH) fieldWeight(body:bunni in 328), product of:\n        1.4142135 = tf(termFreq(body:bunni)=2)\n        6.9227004 = idf(docFreq=116, maxDocs=43690)\n        0.0390625 = fieldNorm(field=body, doc=328)\n    0.20957555 = (MATCH) weight(body:easter^0.7999738 in 328), product of:\n      0.4799619 = queryWeight(body:easter^0.7999738), product of:\n        0.7999738 = boost\n        6.453766 = idf(docFreq=186, maxDocs=43690)\n        0.09296464 = queryNorm\n      0.4366504 = (MATCH) fieldWeight(body:easter in 328), product of:\n        1.7320508 = tf(termFreq(body:easter)=3)\n        6.453766 = idf(docFreq=186, maxDocs=43690)\n        0.0390625 = fieldNorm(field=body, doc=328)\n  0.33333334 = coord(2/6)\n",
"p5zqzz/node/1204":"\n0.10955032 = (MATCH) product of:\n  0.32865095 = (MATCH) sum of:\n    0.10455858 = (MATCH) weight(body:bunni^0.8582874 in 432), product of:\n      0.5523649 = queryWeight(body:bunni^0.8582874), product of:\n        0.8582874 = boost\n        6.9227004 = idf(docFreq=116, maxDocs=43690)\n        0.09296464 = queryNorm\n      0.1892926 = (MATCH) fieldWeight(body:bunni in 432), product of:\n        1.0 = tf(termFreq(body:bunni)=1)\n        6.9227004 = idf(docFreq=116, maxDocs=43690)\n        0.02734375 = fieldNorm(field=body, doc=432)\n    0.22409238 = (MATCH) weight(body:easter^0.7999738 in 432), product of:\n      0.4799619 = queryWeight(body:easter^0.7999738), product of:\n        0.7999738 = boost\n        6.453766 = idf(docFreq=186, maxDocs=43690)\n        0.09296464 = queryNorm\n      0.46689618 = (MATCH) fieldWeight(body:easter in 432), product of:\n        2.6457512 = tf(termFreq(body:easter)=7)\n        6.453766 = idf(docFreq=186, maxDocs=43690)\n        0.02734375 = fieldNorm(field=body, doc=432)\n  0.33333334 = coord(2/6)\n"},
    "filter_queries":["{!tag=sites}sm_sitename:(FCM OR BCM OR CCM)"],
    "parsed_filter_queries":["sm_sitename:FCM sm_sitename:BCM sm_sitename:CCM"]}}

Does this indicate a server misconfiguration, incorrectly indexed content, or a query that needs to change?

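Are you indexing HTML? You may want to strip the HTML markup from the text at the start of the analysis chain. For more information, see the documentation for HTMLStripCharFilter.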

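Can I use this on the query analyzer for a text field?

The query parser doesn't use tokenizers or char filters, so you will need to strip the formatting before sending the text to Solr. If they are only entities, such as &nbsp;, your programming language probably has an unescape library function available.

Thanks a lot for the help! In the end, I added another text field with HTMLStripCharFilterFactory on both the query and index analyzers, so that I can still index the original, unstripped HTML. My search now uses this new field in the MLT query.

Solr also has an extracting request handler that can reduce HTML (or most rich documents) to plain text.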
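
For the client-side route mentioned in the comments (removing the formatting before the text reaches Solr), here is a minimal Python sketch using only the standard library. It is not the Solr-side HTMLStripCharFilterFactory approach that was finally adopted. The function name `clean_body` is hypothetical, and the repeated decoding rests on an assumption: the `ampampnbsp` term in the interesting terms suggests the stored body holds entities that were escaped more than once, such as `&amp;amp;nbsp;`.

```python
import html
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment, dropping the tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_body(raw, max_passes=5):
    """Return plain text with entities decoded, tags removed, whitespace collapsed."""
    text = raw
    # Assumption: entities may be escaped several times ("&amp;amp;nbsp;"),
    # so decode until the text stops changing (bounded by max_passes).
    # Note that escaped markup such as "&lt;b&gt;" becomes a real tag here
    # and is then dropped by the extractor below.
    for _ in range(max_passes):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded

    extractor = _TextExtractor()
    extractor.feed(text)
    extractor.close()

    # str.split() with no arguments also splits on newlines and non-breaking
    # spaces, so neither survives as a separate token.
    return " ".join("".join(extractor.parts).split())


if __name__ == "__main__":
    print(clean_body("<p>Easter&amp;amp;nbsp;bunny</p>\n<div>rabbit</div>"))
```

Running the cleaned text through the indexing client would keep whitespace and entity debris out of the `body` field, at the cost of losing the original markup unless it is stored in a separate field.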