
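It is even somewhat efficient, though. I thought about a recursive query that generates the series, but I am not yet skilled enough in PostgreSQL to write something like that, and its performance would be questionable too. If you are not limited to a single query, you could create another table, prefill it with distinct timestamps spaced `group_period` apart, and then join on the condition that the timestamp lies between `time_start` and `(time_start + duration)`.

@Timekiller Prefilling seems to do the job, since I am not limited in the number of queries (I run everything as one transaction). I am only worried about the performance of this approach: is it acceptable if the base table has hundreds of millions of rows?

@Timekiller I have followed your advice, see my answer.

Now that I am fully awake and have thought it over, `time_start` between `period_start` and `(period_start + group_period)` is not enough, because it will not produce multiple rows for long-running tasks. You really do need overlapping intervals; fortunately that is simpler than a loop: `time_start`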
SELECT
    ROUND(time_start/group_period,0) AS time_period,
    SUM(count_event1) AS sum_event1,
    SUM(count_event2) AS sum_event2 
FROM measurements
GROUP BY time_period;
-- Since there's a problem with declaring variables in PostgreSQL,
-- we will be using aliases for the arguments required by the script.

-- First some configuration:
--   group_period = 3600   -- group by 1 hour (= 3600 seconds)
--   min_time = 1440226301 -- Sat, 22 Aug 2015 06:51:41 GMT
--   max_time = 1450926301 -- Thu, 24 Dec 2015 03:05:01 GMT

-- Calculate the number of started periods in the given interval in advance.
--   period_count = CEIL((max_time - min_time) / group_period)

SET TIME ZONE UTC;
BEGIN TRANSACTION;

-- Create a temporary table and fill it with all time periods.
CREATE TEMP TABLE periods (period_start TIMESTAMP)
    ON COMMIT DROP;
INSERT INTO periods (period_start)
    SELECT to_timestamp(min_time + group_period * coefficient)
    FROM generate_series(0, period_count) as coefficient;

-- Group data by the time periods.
-- Note that we don't require exact overlap of intervals:
--   A. [period_start, period_start + group_period]
--   B. [time_start, time_start + duration]
-- An exact overlap test would yield the best possible result, but it would
-- also slow the query down significantly because of interval B.
-- We require only: period_start <= time_start <= period_start + group_period
SELECT
    period_start,
    COUNT(measurements.*) AS count_measurements,
    SUM(count_event1) AS sum_event1,
    SUM(count_event2) AS sum_event2
FROM periods
LEFT JOIN measurements
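-- group_period is in seconds, so it is converted to an interval before being
-- added to the TIMESTAMP column (this assumes time_start is a TIMESTAMP too).
ON time_start BETWEEN period_start
    AND (period_start + group_period * INTERVAL '1 second')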
GROUP BY period_start;

COMMIT TRANSACTION;
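
The comments point out that matching on `time_start` alone never yields more than one row for a long-running measurement. A minimal sketch of the overlapping-interval variant follows. It is not the original answer's code: the `measurements_demo` and `periods_demo` tables, their sample rows, and the assumption that `duration` is an integer number of seconds are all hypothetical and only there to make the example self-contained.

```sql
-- Sketch only: table layouts and sample rows are assumptions for illustration.
BEGIN;

CREATE TEMP TABLE measurements_demo (
    time_start   TIMESTAMP,
    duration     INTEGER,      -- assumed to be in seconds
    count_event1 INTEGER,
    count_event2 INTEGER
) ON COMMIT DROP;

INSERT INTO measurements_demo VALUES
    ('2015-08-22 06:51:41', 300,  1, 0),   -- ends within the same hour
    ('2015-08-22 07:50:00', 7200, 2, 1);   -- runs across several hours

-- One row per hourly period; 3600 seconds stands in for group_period.
CREATE TEMP TABLE periods_demo ON COMMIT DROP AS
    SELECT period_start
    FROM generate_series(
        TIMESTAMP '2015-08-22 06:00:00',
        TIMESTAMP '2015-08-22 10:00:00',
        INTERVAL '3600 seconds'
    ) AS period_start;

-- Two half-open intervals [a, b) and [c, d) overlap when a < d AND c < b.
-- Here [a, b) is the measurement and [c, d) is the period.
SELECT
    p.period_start,
    COUNT(m.time_start) AS count_measurements,
    SUM(m.count_event1) AS sum_event1,
    SUM(m.count_event2) AS sum_event2
FROM periods_demo AS p
LEFT JOIN measurements_demo AS m
    ON m.time_start < p.period_start + INTERVAL '3600 seconds'
   AND p.period_start < m.time_start + m.duration * INTERVAL '1 second'
GROUP BY p.period_start
ORDER BY p.period_start;

COMMIT;
```

With this join the second sample measurement is matched to every hourly period it touches, so its event counts are added in full to each of those periods. Whether that is the desired aggregation, or whether the counts should be split across periods, depends on what the counters mean and is not settled by the thread.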