PostgreSQL performance difference: date_trunc('day', <timestamp>) vs. <timestamp>::date

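If I want to group a timestamp column, say registered_at, by the day on which each timestamp falls, I can use either date_trunc('day', registered_at) or registered_at::date. The former strips the hours and smaller units from the timestamp but still returns a timestamp, whereas the latter returns the timestamp cast to a date. I am wondering whether one of the two performs better than the other.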
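To make the difference in return types concrete, here is a minimal sketch that applies both expressions to a literal timestamp (the literal value is only an illustration, not taken from the question) and asks PostgreSQL for the resulting type with pg_typeof. The first expression yields a timestamp at midnight and the second a date, which is consistent with the width=8 and width=4 shown in the plans.

```sql
-- Illustrative literal; any timestamp value would do.
SELECT
    date_trunc('day', TIMESTAMP '2019-06-01 13:45:10')            AS truncated,
    pg_typeof(date_trunc('day', TIMESTAMP '2019-06-01 13:45:10')) AS truncated_type,
    (TIMESTAMP '2019-06-01 13:45:10')::date                       AS casted,
    pg_typeof((TIMESTAMP '2019-06-01 13:45:10')::date)            AS casted_type;
```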

When I look at the query plans, the estimated costs are exactly the same, and the actual execution times probably contain a lot of noise.

-- SELECT date_trunc('day', registered_at)
Seq Scan on customers  (cost=0.00..5406.45 rows=23987 width=8) (actual time=0.023..46.811 rows=24436 loops=1)
  Filter: (created_at > '2019-06-01 00:00:00'::timestamp without time zone)
  Rows Removed by Filter: 113958
Planning time: 0.107 ms
Execution time: 48.158 ms

-- SELECT registered_at::date
Seq Scan on customers  (cost=0.00..5406.45 rows=23987 width=4) (actual time=0.017..34.353 rows=24436 loops=1)
  Filter: (created_at > '2019-06-01 00:00:00'::timestamp without time zone)
  Rows Removed by Filter: 113958
Planning time: 0.121 ms
Execution time: 35.548 ms

Does anyone know which approach is faster, either for the truncation itself or when it is subsequently used in a group by?
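For the group by case, the comparison could be sketched as the two queries below. The table and column names come from the plans above; the exact shape of the grouped query (count(*) per day, grouping by the first output column) is an assumption, since the question does not show it.

```sql
-- Variant 1: group on the truncated timestamp (8-byte key).
EXPLAIN (ANALYZE)
SELECT date_trunc('day', registered_at) AS day, count(*)
FROM customers
WHERE created_at > '2019-06-01'
GROUP BY 1;

-- Variant 2: group on the value cast to date (4-byte key).
EXPLAIN (ANALYZE)
SELECT registered_at::date AS day, count(*)
FROM customers
WHERE created_at > '2019-06-01'
GROUP BY 1;
```

Comparing the aggregate node in both plans, rather than only the total execution time, shows whether the grouping step itself behaves differently.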

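If the results vary when you run the test several times, it is probably a caching issue.

Besides, there is always some random noise in query execution times.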

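If you repeat the experiment many times and the results for the two cases differ in a statistically significant way, and the queries are identical except that a different function is called for each result row, then the difference must be the function execution time.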
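One way to repeat the experiment is sketched below, assuming the customers table from the plans above. An anonymous PL/pgSQL block runs both variants ten times in alternation and reports each wall-clock duration as a notice. The run count of 10 is arbitrary, and the timings include PL/pgSQL overhead, so treat them as relative rather than absolute.

```sql
DO $$
DECLARE
    t0 timestamptz;
BEGIN
    FOR i IN 1..10 LOOP
        -- PERFORM runs the query and discards the rows.
        t0 := clock_timestamp();
        PERFORM date_trunc('day', registered_at)
        FROM customers
        WHERE created_at > '2019-06-01';
        RAISE NOTICE 'run %: date_trunc took % ms',
            i, extract(epoch FROM clock_timestamp() - t0) * 1000;

        t0 := clock_timestamp();
        PERFORM registered_at::date
        FROM customers
        WHERE created_at > '2019-06-01';
        RAISE NOTICE 'run %: ::date took % ms',
            i, extract(epoch FROM clock_timestamp() - t0) * 1000;
    END LOOP;
END
$$;
```

Alternating the two variants inside each iteration spreads cache warm-up effects across both, which makes the remaining difference easier to attribute to the function call.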

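Run explain (analyze, buffers) and you will most likely see that the second execution simply got more blocks from the cache.

I found Buffers: shared hit=4910 for both queries, if you mean that this could explain the difference in execution time.

I would not rely on the execution times, because they are probably too noisy to conclude anything from. The number of buffers retrieved, on the other hand, is actually a good tuning indicator: the fewer buffers a query needs, the better it scales. Whether those buffers are retrieved from the cache does not matter. So when evaluating the performance of different solutions, be sure to pay attention to them.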
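The buffer check suggested above can be sketched as follows, again assuming the customers table and filter from the question. In the output, look for the Buffers line under each plan node: shared hit counts blocks found in PostgreSQL's shared buffer cache, and read counts blocks that had to be fetched from outside it.

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT date_trunc('day', registered_at)
FROM customers
WHERE created_at > '2019-06-01';

EXPLAIN (ANALYZE, BUFFERS)
SELECT registered_at::date
FROM customers
WHERE created_at > '2019-06-01';
```

If both plans report the same buffer totals, as with the shared hit=4910 mentioned above, the two queries touch the same amount of data and any remaining time difference comes from per-row work or measurement noise.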