Apache Solr suggester dictionary is not updated after deleting some records from a collection
I want the Solr suggester dictionary to be updated after I upload some initial data (in Solr 5.3.1). When I delete all the data and upload the modified data again, the suggestion dictionary is updated. But when I delete one specific record (for example id=123), the suggester still returns the deleted record in its result set.

For example, I initially uploaded the following JSON data to mycollection:

json_data.json
[
{
"id": 1,
"name": "New York"
},
{
"id": 2,
"name": "New Jersey"
},
{
"id": 3,
"name": "California"
}
]
I uploaded it with this command:
curl -X POST -H 'Content-Type: application/json' 'http://localhost:8983/solr/mycollection/update?commit=true' --data-binary @json_data.json
Querying the collection then returns all three documents:
{
"responseHeader": {
"status": 0,
"QTime": 0,
"params": {
"q": "*:*",
"indent": "true",
"wt": "json",
"_": "1448271522315"
}
},
"response": {
"numFound": 3,
"start": 0,
"docs": [
{
"id": "1",
"name": [
"New York"
],
"_version_": 1518622755181822000
},
{
"id": "2",
"name": [
"New Jersey"
],
"_version_": 1518622755186016300
},
{
"id": "3",
"name": [
"California"
],
"_version_": 1518622755187064800
}
]
}
}
A suggestion request for "New" returns both matching names:
{
"responseHeader": {
"status": 0,
"QTime": 0
},
"spellcheck": {
"suggestions": [
"New",
{
"numFound": 2,
"startOffset": 0,
"endOffset": 3,
"suggestion": [
"New Jersey",
"New York"
]
}
],
"collations": [
"collation",
"(New Jersey)"
]
}
}
After I delete the record with id 2 ("New Jersey"), the query returns only the two remaining documents:
{
"responseHeader": {
"status": 0,
"QTime": 0,
"params": {
"q": "*:*",
"indent": "true",
"wt": "json",
"_": "1448272061874"
}
},
"response": {
"numFound": 2,
"start": 0,
"docs": [
{
"id": "1",
"name": [
"New York"
],
"_version_": 1518622755181822000
},
{
"id": "3",
"name": [
"California"
],
"_version_": 1518622755187064800
}
]
}
}
However, the same suggestion request still returns the deleted "New Jersey":
{
"responseHeader": {
"status": 0,
"QTime": 0
},
"spellcheck": {
"suggestions": [
"New",
{
"numFound": 2,
"startOffset": 0,
"endOffset": 3,
"suggestion": [
"New Jersey",
"New York"
]
}
],
"collations": [
"collation",
"(New Jersey)"
]
}
}
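The delete command itself is not shown in the question. As a rough sketch, a delete by id followed by an immediate commit could be sent to the same update handler used for the upload. The base URL is taken from the upload command; the helper names `delete_body` and `delete_by_id` are made up for illustration.

```shell
#!/bin/sh
# Base URL as used in the upload command; adjust for your setup.
SOLR_URL="http://localhost:8983/solr/mycollection"

# Build the JSON body for Solr's delete-by-id update command.
# (Hypothetical helper; the id is inserted verbatim, so keep it simple.)
delete_body() {
    printf '{"delete":{"id":"%s"}}' "$1"
}

# Send the delete to the update handler and commit right away.
delete_by_id() {
    curl -X POST -H 'Content-Type: application/json' \
        "$SOLR_URL/update?commit=true" \
        --data-binary "$(delete_body "$1")"
}

# Usage (needs a running Solr instance):
#   delete_by_id 2
```

Running `delete_by_id 2` would correspond to the state shown above, where the query returns two documents but the suggester still lists "New Jersey".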
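I tried reloading the core from the admin UI and restarting the Solr server, but the result stays the same.

What could be the problem? Does the suggester component (solr.SpellCheckComponent) use a cache? If so, how do I clear it?

Any help would be much appreciated.

Most spellcheckers and suggesters do not work directly off the index; they build a parallel structure instead. There is a setting that rebuilds this structure on every commit, and if that setting is false, you can pass a flag in the request to trigger a rebuild. The spellchecker has such a flag as well.

So check your configuration and try calling rebuild/reload explicitly.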
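As an illustration of triggering a rebuild explicitly, the sketch below builds a suggestion request that also carries `spellcheck.build=true`, the flag named in the follow-up comment. The handler name `suggest` is an assumption, since the question does not show which request handler is configured, and `rebuild_url` is a made-up helper. The build-on-commit setting referred to above is most likely the spellchecker's `buildOnCommit` option in solrconfig.xml, but the thread does not name it.

```shell
#!/bin/sh
# Base URL from the question; HANDLER is a guess, use the handler
# that is actually defined in your solrconfig.xml.
SOLR_URL="http://localhost:8983/solr/mycollection"
HANDLER="suggest"

# Build a request URL that asks for suggestions and forces the
# spellchecker to rebuild its dictionary first.
# The prefix is not URL-encoded, so pass simple terms only.
rebuild_url() {
    printf '%s/%s?q=%s&spellcheck=true&spellcheck.build=true&wt=json' \
        "$SOLR_URL" "$HANDLER" "$1"
}

# Usage (needs a running Solr instance):
#   curl "$(rebuild_url New)"
```

As the comments below report, in this particular case the explicit rebuild did not remove the deleted entry, so treat it as a first check rather than a guaranteed fix.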
Thanks for the reply, Alexandre. My solrconfig.xml has that setting set to true. I tried both the spellcheck.build and spellcheck.reload flags in the suggestion request, but neither of them updated the suggester dictionary in mycollection. I always get the deleted record "New Jersey" in the response. The suggester dictionary is updated only when I delete all records and upload the remaining 2 records again.

It sounds as if the spellchecker ignores the deleted flag. Records are not really removed until their segments are merged (for example by an optimize call, which is not recommended, or by reaching the merge threshold), or until every record in a segment is deleted and the whole segment can be dropped. Do an experiment. Add two other records, commit. Add a test record, commit. Check that it shows up in the spelling suggestions. Delete the test record and commit. If it disappears from the suggestions, my hypothesis is correct. Either way, this is probably worth bringing to the Solr users mailing list.

Yes Alexandre, your hypothesis is correct. I added two new records, then added a test record. Then I deleted only the test record, and it disappeared from the suggestions. That is surprising.

"Deleted" as opposed to "merged/purged" records are a deep part of the Lucene implementation and sometimes affect less obvious parts of the system. I have added a comment to the documentation page so that this can be reviewed and, if it is correct, mentioned there.
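The experiment described in the comments could be scripted roughly as follows. This is only a sketch: the handler name `suggest`, the sample record ids and names (101, 102, 999, "Nevada", "Texas", "Testville") and all helper names are invented for illustration, and the field layout simply mirrors the id/name documents from the question.

```shell
#!/bin/sh
SOLR_URL="http://localhost:8983/solr/mycollection"
HANDLER="suggest"   # assumed handler name, check your solrconfig.xml

# JSON for a single document with the id/name fields from the question.
doc_json() {
    printf '[{"id":"%s","name":"%s"}]' "$1" "$2"
}

# Post a JSON body to the update handler and commit immediately,
# so each call ends in its own commit.
post_update() {
    curl -s -X POST -H 'Content-Type: application/json' \
        "$SOLR_URL/update?commit=true" --data-binary "$1"
}

# Ask the suggester for completions of a simple, unencoded prefix.
suggest() {
    curl -s "$SOLR_URL/$HANDLER?q=$1&spellcheck=true&wt=json"
}

run_experiment() {
    # Step 1: two unrelated records, committed together.
    post_update '[{"id":"101","name":"Nevada"},{"id":"102","name":"Texas"}]'
    # Step 2: the test record in a separate commit.
    post_update "$(doc_json 999 Testville)"
    # Step 3: the test record should be suggested now.
    suggest Test
    # Step 4: delete only the test record and commit.
    post_update '{"delete":{"id":"999"}}'
    # Step 5: if the hypothesis holds, it is no longer suggested.
    suggest Test
}

# Usage (needs a running Solr instance):
#   run_experiment
```

Committing the test record separately is the point of the exercise: it is meant to end up in its own segment, so deleting it lets the whole segment be dropped.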