Optimizing SQL join call

sql, ruby-on-rails, postgresql, heroku-postgres

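I have a SQL query that performs poorly. I've done some research on joins, watched tutorials, made sure the right indexes are defined, and so on, but honestly I'm at a loss as to how to improve the performance of this query.

I have the following schema definition: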

create_table "training_plans", :force => true do |t|
  t.integer  "user_id"
end

add_index "training_plans", ["user_id"], :name => "index_training_plans_on_user_id"

create_table "training_weeks", :force => true do |t|
  t.integer  "training_plan_id"
  t.date     "start_date"
end

add_index "training_weeks", ["training_plan_id", "start_date"], :name => "index_training_weeks_on_training_plan_id_and_start_date"
add_index "training_weeks", ["training_plan_id"], :name => "index_training_weeks_on_training_plan_id"

create_table "training_efforts", :force => true do |t|
  t.string   "name"
  t.date     "plandate"
  t.integer  "training_week_id"
end

add_index "training_efforts", ["plandate"], :name => "index_training_efforts_on_plandate"
add_index "training_efforts", ["training_week_id", "plandate"], :name => "index_training_efforts_on_training_week_id_and_plandate"
add_index "training_efforts", ["training_week_id"], :name => "index_training_efforts_on_training_week_id"
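I then call the following to collect all training efforts associated with a particular training plan, including all related ride objects, where the training effort's plandate falls within the target date range, sorted by plandate: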

    tefts = self.training_efforts.includes(:rides).order("plandate ASC").where("plandate >= ? AND plandate <= ?",
                                                      beginning_date,
                                                      end_date)
This generates the following query output:

TrainingEffort Load (3393.6ms)  SELECT "training_efforts".* FROM "training_efforts" 
  INNER JOIN "training_weeks" ON "training_efforts"."training_week_id" = "training_weeks"."id" 
  WHERE "training_weeks"."training_plan_id" = 104 
  AND (plandate >= '2015-01-05' AND plandate <= '2016-01-03') ORDER BY plandate ASC
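I believe I have the right indexes defined. The tables aren't that large. Yet the query takes a very long time. As further background, this is on Heroku Postgres. Finally, I'll mention that on my dev system the query is slower than most at 3.3ms, but still not 1000x slower than average.

Thanks in advance for any help optimizing this query.

Update: Here is the EXPLAIN output for the query issued on my dev system: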

explain SELECT "training_efforts".* FROM "training_efforts" INNER JOIN "training_weeks" 
  ON "training_efforts"."training_week_id" = "training_weeks"."id" 
  WHERE "training_weeks"."training_plan_id" = 7 
  AND (plandate >= '2015-01-05' AND plandate <= '2016-01-03') ORDER BY plandate ASC;
                                          QUERY PLAN                                           
-----------------------------------------------------------------------------------------------
 Sort  (cost=430.52..432.04 rows=606 width=120)
   Sort Key: training_efforts.plandate
   ->  Hash Join  (cost=15.12..402.51 rows=606 width=120)
         Hash Cond: (training_efforts.training_week_id = training_weeks.id)
         ->  Seq Scan on training_efforts  (cost=0.00..377.25 rows=1089 width=120)
               Filter: ((plandate >= '2015-01-05'::date) AND (plandate <= '2016-01-03'::date))
         ->  Hash  (cost=11.86..11.86 rows=261 width=4)
               ->  Seq Scan on training_weeks  (cost=0.00..11.86 rows=261 width=4)
                     Filter: (training_plan_id = 7) 
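Update 2: To see whether my indexes would be used with a different query, and noting that there are 7x as many training_efforts as training_weeks (both of which have date columns), I tried filtering on the training_weeks date rather than the training_efforts date, as follows: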

explain SELECT "training_efforts".* FROM "training_efforts" INNER JOIN "training_weeks" 
  ON "training_weeks"."id" = "training_efforts"."training_week_id" 
  WHERE "training_weeks"."id" IN (SELECT "training_weeks"."id" FROM "training_weeks" 
  WHERE "training_weeks"."training_plan_id" = 7 AND (start_date >= '2015-01-05' AND start_date <= '2016-01-03')) 
  ORDER BY plandate ASC;
                                                                     QUERY PLAN                                                                     
----------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=376.83..378.34 rows=602 width=120)
   Sort Key: training_efforts.plandate
   ->  Nested Loop  (cost=14.23..349.04 rows=602 width=120)
         ->  Hash Semi Join  (cost=13.95..26.83 rows=86 width=8)
               Hash Cond: (training_weeks.id = training_weeks_1.id)
               ->  Seq Scan on training_weeks  (cost=0.00..10.69 rows=469 width=4)
               ->  Hash  (cost=12.87..12.87 rows=86 width=4)
                     ->  Bitmap Heap Scan on training_weeks training_weeks_1  (cost=5.37..12.87 rows=86 width=4)
                           Recheck Cond: ((training_plan_id = 7) AND (start_date >= '2015-01-05'::date) AND (start_date <= '2016-01-03'::date))
                           ->  Bitmap Index Scan on index_training_weeks_on_training_plan_id_and_start_date  (cost=0.00..5.35 rows=86 width=0)
                                 Index Cond: ((training_plan_id = 7) AND (start_date >= '2015-01-05'::date) AND (start_date <= '2016-01-03'::date))
         ->  Index Scan using index_training_efforts_on_training_week_id on training_efforts  (cost=0.28..3.68 rows=7 width=120)
               Index Cond: (training_week_id = training_weeks.id)

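This looks slightly better, but I'm still not convinced it's optimized...

How many rows are in each table? Were these tables recreated recently, or are they old? Have you analyzed the tables recently? It looks like the query is doing sequential scans and not using any indexes.

I would issue a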

vacuum analyze

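on the entire database, or at least on these two tables. A lot of the time, the optimizer will skip indexes if it doesn't have proper statistics for the table.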
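As a rough sketch of that suggestion, the statements for the two tables involved in the join could be generated as below. The helper name `vacuum_analyze_statements` is made up for illustration; `VACUUM ANALYZE` is the PostgreSQL command named above, and the table names come from the schema in the question.

```ruby
# The two tables that take part in the slow join (from the question's schema).
TABLES = %w[training_weeks training_efforts].freeze

# Hypothetical helper: builds one VACUUM ANALYZE statement per table.
# It only produces SQL text; it does not connect to a database.
def vacuum_analyze_statements(tables)
  tables.map { |table| %(VACUUM ANALYZE "#{table}";) }
end

# Print the statements so they can be pasted into psql.
puts vacuum_analyze_statements(TABLES)
```

Note that PostgreSQL does not allow VACUUM inside a transaction block, so the statements need to be run on their own. Re-running EXPLAIN on the query afterwards shows whether the plan changed.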

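It looks like you aren't actually using the output of the join, so I'd suggest dropping it altogether and seeing whether that improves performance.

I'd suggest using a raw query. You should be able to call the connection.execute method of an ActiveRecord object with the SQL and parameters, putting a ? in place of each parameter (i.e. variable) that needs to be interpolated by the SQL library, and then passing those parameters as a list in the second argument to the method.

For the raw SQL, I'd suggest trying something like the following, substituting placeholders and parameters as needed for any values that may vary. I suspect it will perform better.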

SELECT te.*
FROM training_efforts AS te
WHERE EXISTS (SELECT 1
              FROM training_weeks AS tw
              WHERE tw.id = te.training_week_id
                AND tw.training_plan_id = 7
                AND start_date >= '2015-01-05' AND start_date <= '2016-01-03'
            )
ORDER BY plandate ASC

As for converting this into an ActiveRecord query, I'm not sure it offers that level of control; it may be best to leave it as a raw query.
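A minimal sketch of the placeholder idea, assuming the values are kept separate from the SQL text: the helper `training_efforts_query` is hypothetical, and the subquery matches on `tw.id` because the schema in the question defines no `training_week_id` column on `training_weeks`.

```ruby
# Hypothetical helper (not from the original post): returns the SQL text
# followed by its bind values in one array, with a `?` marking each value.
def training_efforts_query(plan_id, beginning_date, end_date)
  sql = <<~SQL
    SELECT te.*
    FROM training_efforts AS te
    WHERE EXISTS (SELECT 1
                  FROM training_weeks AS tw
                  WHERE tw.id = te.training_week_id
                    AND tw.training_plan_id = ?
                    AND tw.start_date >= ?
                    AND tw.start_date <= ?)
    ORDER BY te.plandate ASC
  SQL
  [sql, plan_id, beginning_date, end_date]
end

# Example values taken from the queries in the question.
query = training_efforts_query(7, "2015-01-05", "2016-01-03")
puts query.first
```

In a Rails model, an array of this shape can be passed to `find_by_sql`, for example `TrainingEffort.find_by_sql(query)`, which quotes the values and substitutes them for the `?` markers. `connection.execute` itself expects an already complete SQL string, so `find_by_sql` is probably the simpler route here.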

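I agree with you about the indexes... why would that be? I'll try another query format and see whether it uses the indexes... The three tables have thousands of rows (5-30k). They have been around for months. Running analyze just now reported that they were autovacuumed within the past two days.

Did the execution plan or the speed change after analyzing? Running vacuum analyze is very important, because the optimizer bases its decisions on statistics about the data. If it thinks your data is very small, or that you will be querying a large portion of it, it will skip indexes entirely, since they can be inefficient in those cases.

Thanks joe and @khampson. I'm marking this as the answer because it came closest to solving the problem. I needed to wait a few days to see the logs, and I'm happy with the results. Basically, I changed the query to: tefts = TrainingEffort.includes(:rides).order("plandate ASC").joins(:training_week).where(:training_weeks => {:id => self.training_weeks.where("start_date >= ? AND start_date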