Ruby on Rails: limiting results per group with an ActiveRecord query

Tags: ruby-on-rails, ruby, postgresql, activerecord, group-by

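My question is somewhat similar to this one -

I want to limit the results to a specific count (100) per site.

The limit can be 100 or 4.

For simplicity, assume it is 4.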

Session.website_only.during(date_range)
       .count(group: [:site_id, :referrer_host],
              order: 'count_all DESC',
              limit: 4)
The generated SQL query looks like this:

SELECT COUNT(*) AS count_all, site_id AS site_id,
referrer_host AS referrer_host FROM "sessions"
WHERE "sessions"."created_at" >= '2013-12-09 00:00:00.000000'
AND "sessions"."created_at" <= '2013-12-16 23:59:59.999999' AND
(referrer_host IS NOT NULL)
AND (("sessions"."referrer_host" NOT ILIKE '%google.com%'
AND "sessions"."referrer_host" NOT ILIKE '%yahoo.com%'
AND "sessions"."referrer_host" NOT ILIKE '%bing.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%aol.com%')) 
AND (("sessions"."referrer_host" NOT ILIKE '%twitter.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%facebook.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%linkedin.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%fb.me%')) 
GROUP BY "sessions"."site_id", "sessions"."referrer_host" 
ORDER BY count_all DESC LIMIT 4
What I want

I want to limit the number of sets in each group to 4, rather than getting an arbitrary number per group.

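I don't think there is a way to do this in the database without computing the values for all rows and then filtering. So in this case I would rather filter in Ruby, which keeps the code cleaner and easier to read. Something like this: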

data = {[1, "https"]=>8769, [1, "www.example.com"]=>2359, [1, "www.xyz.com"]=>1935, [1, "www.bayers.com"]=>379, 
[2, "www.ruby.com"]=>1322, [2, "www.employment.com"]=>472, [2, "https"]=>424, 
[3, "www.rails.com"]=>424, [3, "www.arizona.net"]=>392, [3, "www.murphy.com"]=>390, 
[4, "www.associates.com"]=>374, [4, "www.reddit.com"]=>365, [4, "www.razorshape.com"]=>352, 
[5, "www.rediff.com"]=>337, [5, "www.tumbleweed.com"]=>327, [5, "www.arizona.com"]=>289, 
[6, "https"]=>275, [131, "www.example.com"]=>253, [6, "www.murphy.com"]=>236, [6, "www.associates.com"]=>227}

limit = 4  # or 100
# counts is keyed by site_id and tracks how many rows were kept per site (default 0)
counts = Hash.new(0)
result = Hash.new

data.each do |k, v|
  site = k[0]
  if counts[site] < limit
    counts[site]+=1
    result[k]=v
  end
end

puts counts
puts result
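
The loop above keeps the first `limit` rows it encounters for each site, so it relies on `data` already being ordered by count. As a minimal sketch under that caveat, the variant below ranks the rows inside each site explicitly before truncating. `top_per_site` is a hypothetical helper name, not part of the original answer, and the sample data is made up for illustration.

```ruby
# Hypothetical helper: keep the `limit` highest counts for every site_id.
# `data` is a hash shaped like the one above: { [site_id, referrer_host] => count }.
def top_per_site(data, limit)
  data
    .group_by { |(site_id, _host), _count| site_id }                       # bucket rows by site
    .flat_map { |_site_id, rows| rows.max_by(limit) { |_key, count| count } } # top N per bucket
    .to_h                                                                  # back to the original hash shape
end

# Illustrative input, deliberately not sorted by count.
sample = {
  [1, "a.example"] => 5,
  [2, "x.example"] => 9,
  [1, "b.example"] => 7,
  [1, "c.example"] => 1,
  [2, "y.example"] => 3
}

p top_per_site(sample, 2)
```

Because the ranking happens per site, the input order no longer matters; the cost is still one pass over every grouped row, as in the loop above.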

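The final format of the `counts` structure is not exactly the same as that of the `data` structure, but it can easily be converted back. A running version of the code was linked from the original answer.

So, isn't this what you need? What do you get? What do you need?

I want to limit each group to a specific number (100) per site. The current query limits the complete result set.

So you only want the groups with more than 100 results? Or what do you want to do with the results?

No, I want to limit the number of sets returned per site to 4.

Do you want at most 4 results per `site_id`?

Thanks for the answer. If the whole set grows to 50K, will it slow down? That is what it does now, since that is the size I am dealing with :)

Yes, it will be slow. Doing this in plain SQL will probably be slow too, because the query cannot compute this kind of "compound limit". To do it in SQL, you first need to compute the counts and then filter them by joining the computed table with itself and removing the unnecessary records. In general, if you are going to work with big data, there are better ways than simple SQL queries.

Agreed. I will look for the query.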
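
The comments describe computing the counts first and then filtering them in SQL. One alternative not mentioned in the thread is a PostgreSQL window function, which ranks the grouped rows inside each site and keeps the top ones. The sketch below only builds the SQL string. `top_referrers_sql` and the `rank_in_site` alias are hypothetical names, the table and column names are taken from the query in the question, and the date and `referrer_host` exclusion filters are omitted for brevity.

```ruby
# Hypothetical helper: builds a PostgreSQL query that returns at most `limit`
# referrer hosts per site, ranked by session count.
def top_referrers_sql(limit)
  limit = Integer(limit) # raises on non-numeric input, so the interpolation below is safe
  <<~SQL
    SELECT site_id, referrer_host, count_all
    FROM (
      SELECT site_id,
             referrer_host,
             COUNT(*) AS count_all,
             ROW_NUMBER() OVER (PARTITION BY site_id ORDER BY COUNT(*) DESC) AS rank_in_site
      FROM sessions
      WHERE referrer_host IS NOT NULL
      GROUP BY site_id, referrer_host
    ) ranked
    WHERE rank_in_site <= #{limit}
    ORDER BY site_id, count_all DESC
  SQL
end

puts top_referrers_sql(4)
```

In a Rails app the string could be run with something like `Session.find_by_sql(top_referrers_sql(4))`. The database still has to count every group before ranking, so this moves the filtering into PostgreSQL rather than avoiding the work.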