Slow PostgreSQL query for mins and maxes over equal intervals within a time period
I have a system with many measurement devices. The measurements are stored in the table "sample_data". Each device can produce up to 10 million measurements per year. Most of the time the user is only interested in 100 min-max pairs over equal intervals within some period, for example the last 24 hours or the last 53 weeks. To get these 100 mins and maxes, the period is divided into 100 equal intervals, and the minimum and maximum are extracted from each interval. Could you recommend the most efficient way to query this data? So far I have tried the following query:
WITH periods AS (
SELECT time.start AS st, time.start + (interval '1 year' / 100) AS en
FROM generate_series(now() - interval '1 year', now(), interval '1 year' / 100) AS time(start)
)
SELECT s.* FROM sample_data s
JOIN periods ON s.time BETWEEN periods.st AND periods.en
JOIN devices d ON d.customer_id = 23
WHERE
s.id = (SELECT id FROM sample_data WHERE device_id = d.id and time BETWEEN periods.st AND periods.en ORDER BY sample ASC LIMIT 1) OR
s.id = (SELECT id FROM sample_data WHERE device_id = d.id and time BETWEEN periods.st AND periods.en ORDER BY sample DESC LIMIT 1)
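This query took about 4 seconds. That is not acceptable, because the sample_data table can contain up to 10 million rows per device.
I can see that it is not executed in a very optimal way, but I don't understand why. I thought I had indexed all the key fields used in this query.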
Can you recommend a faster way to get this kind of statistics?
Table "devices":
Column | Type | Modifiers
--------------------+-----------------------------+------------------------------------------------------
id | integer | not null default nextval('devices_id_seq'::regclass)
customer_id | integer |
<Other fields skipped as they are not involved into the query>
Indexes:
"devices_pkey" PRIMARY KEY, btree (id)
"index_devices_on_iccid" UNIQUE, btree (iccid)
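The sample_data table has about 1.7 million rows, roughly 720K of them for the 4 devices belonging to customer_id = 23.
The table is currently populated with test data.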
Result of "SELECT version()":
track_io_timing is set to "on".
EXPLAIN (ANALYZE, BUFFERS) result here:
I'm guessing that the performance is driven by the subqueries in the where clause. Let's look at one of them:
WHERE s.id = (SELECT sd.id
FROM sample_data sd
WHERE sd.device_id = d.id and
sd.time BETWEEN periods.st AND periods.en
ORDER BY sd.sample ASC
LIMIT 1
)
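You have an index on sample_data(device_id, time, sample), and you would like the database engine to use it. Unfortunately, it can only make full use of the index for the where clause. Because of the between, which is a range condition on time, it probably won't use the index for the order by.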
Would it be possible for you to write the order by using time instead?
WHERE s.id = (SELECT id
FROM sample_data
WHERE device_id = d.id and
time BETWEEN periods.st AND periods.en
ORDER BY time ASC
LIMIT 1
)
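If the ordering has to stay on sample, one alternative worth testing is to drop the per-interval ORDER BY ... LIMIT 1 subqueries and aggregate in a single pass. The following is only a sketch under assumptions: the table and column names come from the question, while the bounds CTE, the bucket arithmetic with width_bucket, and the output column names are illustrative choices, and it has not been run against the real data.

```sql
-- Sketch only: one pass over the last year, grouped into 100 equal buckets.
-- sample_data(device_id, time, sample) and devices(id, customer_id) are taken
-- from the question; "bounds", "bucket", "min_sample" and "max_sample" are
-- names made up for this example.
WITH bounds AS (
    SELECT now() - interval '1 year' AS st, now() AS en
)
SELECT s.device_id,
       -- Map each timestamp to a bucket number from 1 to 100.
       width_bucket(extract(epoch FROM s.time),
                    extract(epoch FROM b.st),
                    extract(epoch FROM b.en),
                    100) AS bucket,
       min(s.sample) AS min_sample,
       max(s.sample) AS max_sample
FROM sample_data s
JOIN devices d ON d.id = s.device_id
CROSS JOIN bounds b
WHERE d.customer_id = 23
  AND s.time >= b.st
  AND s.time <  b.en
GROUP BY 1, 2   -- device_id, bucket
ORDER BY 1, 2;
```

Unlike the original query, this returns only the min and max values per device and interval, not the full sample_data rows, and intervals without any measurements are simply absent from the result. Whether it is actually faster depends on the plan, so it is worth checking with EXPLAIN (ANALYZE, BUFFERS).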
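Got it, thanks. Unfortunately, I can't. The "sample" field holds the measured value, and I sort on it to get the min and max values within a period.
@eug.nikolaev. Well, that does demonstrate that Postgres will do the right thing with the index.
So is it possible to get this kind of data with a query at all? If not, it looks like I will have to prepare the max/mins on data insertion.
Gordon Linoff, thanks for the explanation! I decided not to use the sorting there. It will work (for now).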
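The idea of preparing the mins and maxes at insertion time, raised in the comments above, could look roughly like the sketch below. Everything in it is an assumption for illustration: the rollup table name sample_data_hourly, the hourly granularity, the double precision type for the measured value, and the availability of ON CONFLICT (PostgreSQL 9.5 or later).

```sql
-- Hypothetical rollup table: one row per device and hour.
CREATE TABLE sample_data_hourly (
    device_id  integer          NOT NULL,
    hour_start timestamptz      NOT NULL,
    min_sample double precision NOT NULL,
    max_sample double precision NOT NULL,
    PRIMARY KEY (device_id, hour_start)
);

-- Run alongside each insert into sample_data.
-- $1 = device id, $2 = measurement time, $3 = measured value.
INSERT INTO sample_data_hourly (device_id, hour_start, min_sample, max_sample)
VALUES ($1, date_trunc('hour', $2::timestamptz), $3, $3)
ON CONFLICT (device_id, hour_start) DO UPDATE
SET min_sample = least(sample_data_hourly.min_sample, EXCLUDED.min_sample),
    max_sample = greatest(sample_data_hourly.max_sample, EXCLUDED.max_sample);
```

Queries over long periods could then aggregate the hourly rows instead of the raw samples. The trade-off is that interval boundaries snap to whole hours, so short periods such as the last 24 hours would still need the raw table.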