PostgreSQL: is there any way to index geo queries better with PostGIS to improve query performance?
Tags: postgresql, postgis

I have a fairly large table (2 million records) named places. I use two numeric(9,6) columns named latitude and longitude to store the geographic location.
I frequently need to ask: "How many places lie within a radius of x kilometers of a given point?"
I run this with a query like the following:
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
AND (extensions.ST_DWithin(
       extensions.ST_GeographyFromText(
         'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')'
       ),
       extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'),
       15000
     ))
My index looks like this:
CREATE INDEX index_places_location
ON de.places USING gist
(extensions.st_geographyfromtext(((('SRID=4326;POINT('::text || longitude) || ' '::text) || latitude) || ')'::text))
TABLESPACE pg_default WHERE state::text = 'active'::text
;
I have very solid hardware (64 cores, 192 GB RAM, 8x enterprise SSDs in a hardware RAID array, etc.).
Now, if I run EXPLAIN ANALYZE on the query, I get the following:
"Finalize Aggregate (cost=512320.91..512320.92 rows=1 width=8) (actual time=1677.327..1677.327 rows=1 loops=1)"
" -> Gather (cost=512320.28..512320.89 rows=6 width=8) (actual time=1675.946..1732.657 rows=7 loops=1)"
" Workers Planned: 6"
" Workers Launched: 6"
" -> Partial Aggregate (cost=511320.28..511320.29 rows=1 width=8) (actual time=1655.383..1655.384 rows=1 loops=7)"
" -> Parallel Bitmap Heap Scan on places (cost=125298.79..511310.07 rows=4085 width=0) (actual time=1506.195..1655.008 rows=3781 loops=7)"
" Recheck Cond: ((extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)) OPERATOR(extensions.&&) '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography) AND ((state)::text = 'active'::text))"
" Filter: (('0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography OPERATOR(extensions.&&) extensions._st_expand(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '15000'::double precision)) AND extensions._st_dwithin(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography, '15000'::double precision, true))"
" Rows Removed by Filter: 1380"
" Heap Blocks: exact=12774"
" -> Bitmap Index Scan on index_places_location (cost=0.00..125292.67 rows=367634 width=0) (actual time=1501.179..1501.179 rows=89886 loops=1)"
" Index Cond: (extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)) OPERATOR(extensions.&&) '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography)"
"Planning Time: 0.786 ms"
"Execution Time: 1732.762 ms"
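I don't know about you, but I would still like this to be faster. Am I overlooking some smart PostGIS indexing feature that I should be using instead of what I am doing now?
PS: PostgreSQL 11 and PostGIS 2.5.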
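One commonly suggested alternative to an expression index over concatenated text is to store the point in a real geography column and index that column directly. The following is only a sketch under assumptions, not a tested fix for this table: the column name geog and the index name index_places_geog are hypothetical, and the column would have to be kept in sync with latitude and longitude (for example by a trigger or by the application). It may reduce the per-row cost of rebuilding a geography value from text during the filter step.

```sql
-- Assumption: adding a column named geog (hypothetical) is acceptable.
ALTER TABLE de.places
    ADD COLUMN geog extensions.geography(Point, 4326);

-- Populate it once from the existing numeric columns (longitude first, then latitude).
UPDATE de.places
SET geog = extensions.ST_SetSRID(
               extensions.ST_MakePoint(longitude::double precision,
                                       latitude::double precision),
               4326)::extensions.geography;

-- Partial GiST index on the stored column, mirroring the original predicate.
CREATE INDEX index_places_geog
    ON de.places USING gist (geog)
    WHERE state = 'active';

-- Refresh statistics and clean up the row versions left behind by the UPDATE.
VACUUM (ANALYZE) de.places;

-- The radius query then reads the stored value instead of parsing text per row.
SELECT COUNT(*) AS active_count
FROM de.places
WHERE state = 'active'
  AND extensions.ST_DWithin(
          geog,
          extensions.ST_SetSRID(
              extensions.ST_MakePoint(9.157190, 48.808670), 4326
          )::extensions.geography,
          15000);
```

Whether this actually helps here would need to be confirmed with EXPLAIN (ANALYZE, BUFFERS) on the real data.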
Update
SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
AND (extensions.ST_DWithin(
       extensions.ST_GeographyFromText(
         'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')'
       ),
       extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'),
       15000
     ))
Output:
"Aggregate (cost=642181.64..642181.65 rows=1 width=8) (actual time=354.662..354.662 rows=1 loops=1)"
" Buffers: shared hit=96669 dirtied=2043"
" -> Index Scan using index_places_location on places (cost=0.41..642120.37 rows=24509 width=0) (actual time=2.079..351.946 rows=26461 loops=1)"
" Index Cond: (extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)) OPERATOR(extensions.&&) '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography)"
" Filter: (('0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography OPERATOR(extensions.&&) extensions._st_expand(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '15000'::double precision)) AND extensions._st_dwithin(extensions.st_geographyfromtext((((('SRID=4326;POINT('::text || (longitude)::text) || ' '::text) || (latitude)::text) || ')'::text)), '0101000020E610000038842A357B502240CFA0A17F82674840'::extensions.geography, '15000'::double precision, true))"
" Rows Removed by Filter: 9660"
" Buffers: shared hit=96669 dirtied=2043"
"Planning Time: 1.149 ms"
"Execution Time: 354.732 ms"
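This is faster. Why?

Your bitmap scan is a bit odd. The index scan finds 89886 rows and the filter removes 1380 rows, yet you are left with only 3781 rows. I have to conclude that your table is badly in need of a VACUUM. I am not sure which of these numbers are reported across all parallel workers and which are per worker, but I don't think that can be large enough to explain the discrepancy.

Did you run the two queries repeatedly, in alternation, to make sure the result is not a fluke or a caching effect? (Also, follow Laurenz's advice and turn on track_io_timing first if you can.)

Try SET enable_bitmapscan = off; and see whether that speeds up the query. Can you post the EXPLAIN (ANALYZE, BUFFERS) output of the original query and of the execution without the bitmap index scan?

"It's faster, why?" Probably caching. Can I see the EXPLAIN (ANALYZE, BUFFERS) output of the original query?

@LaurenzAlbe and @jjanes: caching is indeed at work. The first time I run the query (after a long pause) it takes 3 to 6 seconds, regardless of whether enable_bitmapscan is on or off. If I then run the query again right away (with different coordinates), it completes in 50-300 ms with bitmapscan either on or off. What does this tell me?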
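The checks suggested in the comments above (dead rows, I/O timing, buffer usage) could be run roughly as follows. This is a minimal sketch, assuming sufficient privileges to change track_io_timing in the session; it only gathers evidence and does not by itself make the query faster.

```sql
-- How many dead row versions does the table carry, and when was it last vacuumed?
SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE schemaname = 'de' AND relname = 'places';

-- Remove dead row versions and refresh planner statistics.
VACUUM (VERBOSE, ANALYZE) de.places;

-- Report time spent on block reads in the plan output (superuser-only setting).
SET track_io_timing = on;

-- Compare "shared hit" (served from PostgreSQL's buffer cache) with
-- "shared read" (fetched from outside it) between a cold and a warm run.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) AS active_count
FROM de."places"
WHERE "places"."state" = 'active'
AND (extensions.ST_DWithin(
       extensions.ST_GeographyFromText(
         'SRID=4326;POINT(' || places.longitude || ' ' || places.latitude || ')'
       ),
       extensions.ST_GeographyFromText('SRID=4326;POINT(9.157190 48.808670)'),
       15000
     ));
```

If the slow first run shows a large "shared read" count and the fast repeat shows mostly "shared hit", that would be consistent with the caching explanation given in the comments.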