Solr facet equivalent of group by?
If I have data like this:
{"field1":"x", "field2":".."}
{"field1":"x", "field2":".."}
{"field1":"y", "field2":".."}
{"field1":"y", "field2":".."}
{"field1":"y", "field2":".."}
With a simple group=true&group.field=field1&group.limit=0
I get a result like this:
{
"responseHeader":{..}
"grouped":{
"field1": {
"matches": 5,
"groups": [
{"groupValue": "x", "doclist":{"numFound": 2, ...}}
{"groupValue": "y", "doclist":{"numFound": 3, ...}}
]
}
}
}
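With this I know the number of documents found (numFound) for each groupValue. The problem is that I need to sort the resulting groups in descending order by that count, and neither sort option can do it: a plain sort=numFound throws an exception saying the field numFound does not exist, and group.sort sorts the documents within each group.
Is there a similar approach using facets, where I can sort the results by count?

You can try: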
http://localhost:8983/solr/your_core/select?facet.field=field1&facet.sort=count&facet.limit=-1&facet=on&indent=on&q=*:*&rows=0&start=0&wt=json
The result will look similar to:
{
"responseHeader":{
"status":0,
"QTime":17,
"params":{
"q":"*:*",
"facet.field":"field1",
"indent":"on",
"start":"0",
"rows":"0",
"facet":"on",
"wt":"json"}},
"response":{"numFound":225364,"start":0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{
"field1":[
"x",113550,
"y",111814]},
"facet_ranges":{},
"facet_intervals":{},
"facet_heatmaps":{}
}
}
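Note that facet_fields returns each field as a flat list that alternates value and count. A minimal Python sketch of turning that list into (value, count) pairs follows; the helper name facet_pairs is hypothetical, and the sample dictionary is copied from the response above rather than fetched from a live server.

```python
def facet_pairs(flat):
    """Turn Solr's flat [value, count, value, count, ...] list into (value, count) tuples."""
    return list(zip(flat[0::2], flat[1::2]))


# Sample copied from the response above (not fetched from a live Solr instance).
response = {
    "facet_counts": {
        "facet_fields": {"field1": ["x", 113550, "y", 111814]}
    }
}

pairs = facet_pairs(response["facet_counts"]["facet_fields"]["field1"])
for value, count in pairs:
    print(f"{value}: {count}")
```

With facet.sort=count the pairs already arrive in descending order of count, so no client-side sorting should be needed.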
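Just tested this with Solr 6.3.0.
For more information, check the relevant parts of the documentation.
If you also want to count the number of available facets, you can use the Solr Stats Component (provided the field's type is numeric, string, or date). Keep in mind, though, that this can cause performance and memory overhead on the server. Run a query like this: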
http://localhost:8983/solr/your_core/select?facet.field=field1&facet.sort=count&facet.limit=10&facet=true&indent=on&q=*:*&rows=0&start=0&wt=json&stats=true&stats.field={!cardinality=true}field1
The response looks like this:
{
"responseHeader":{
"status":0,
"QTime":614,
"params":{
"facet.limit":"10",
"q":"*:*",
"facet.field":"field1",
"indent":"on",
"stats":"true",
"start":"0",
"rows":"0",
"facet":"true",
"wt":"json",
"facet.sort":"count",
"stats.field":"{!cardinality=true}field1"}},
"response":{"numFound":2336315,"start":0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{
"field1":[
"Value1",708116,
"Value2",607088,
"Value3",493949,
"Value4",314433,
"Value5",104478,
"Value6",41099,
"Value7",28879,
"Value8",18767,
"Value9",9308,
"Value10",4545]},
"facet_ranges":{},
"facet_intervals":{},
"facet_heatmaps":{}},
"stats":{
"stats_fields":{
"field1":{
"cardinality":27}}}}
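For more information about stats, you can check the documentation.

Comments:

I tried that too. I didn't mention it up front, but I actually have a lot of results. I can page through them with facet.offset and facet.limit, but Solr doesn't print the number of groups (as grouping does). Do you know a way to get the total count (of facet results)?

Do you want to get all facets (groups) with their counts in a single query? If so, you can use facet.limit=-1. If you want the total count of results, can't you just use the numFound property of response?

@Izagkaretos Unfortunately, I need to build a pageable table from the results. numFound is the number of documents found, not the number of groups created after collapsing on the field.

OK, so the information missing from the faceted approach is the total number of facets?

Yes, I need to know the number of facets to tell which page is the last one. Otherwise I only know the current page, and I can't even tell whether there is a next page until after I've run the query.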