How can I optimize or rewrite this query in SQL Server?
I am using JPA 2.1 with a SQL Server database. Here are the tables (the syntax is PostgreSQL, because that is currently the easiest for me to write):
I want to select all active (not locked) users who do not have a subscription with a specific name. A user may have no subscriptions or several.
The SQL version of my current JPQL query is:
SELECT *
FROM users
WHERE locked = false
AND id NOT IN (SELECT id_user
FROM subscriptions
WHERE name = 'premium')
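I read somewhere that SQL Server executes the nested SELECT once for every result row of the outer SELECT, even when, as in this example, the nested SELECT's result set does not change while the query runs. Is that true?
As the tables grow, this query's runtime performance becomes terrible. How can I rewrite or tune it in SQL Server? Perhaps with a join?

Converting the query to NOT EXISTS() may gain you some performance.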
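Beyond performance, NOT IN and NOT EXISTS differ in how they treat NULLs, which matters if subscriptions.id_user is nullable. The following is a minimal sketch rather than the original poster's schema: the table variables and sample rows are made up for illustration, and it assumes SQL Server 2008 or later with default ANSI settings.

```sql
-- Hypothetical sample data; the table variables mirror the question's tables.
DECLARE @users TABLE (id int PRIMARY KEY, locked bit, name varchar(20));
DECLARE @subscriptions TABLE (id int PRIMARY KEY, name varchar(20), id_user int NULL);

INSERT INTO @users VALUES (1, 0, 'alice'), (2, 0, 'bob'), (3, 1, 'carol');
INSERT INTO @subscriptions VALUES (1, 'premium', 1), (2, 'premium', NULL);

-- NOT IN: the NULL id_user makes the predicate UNKNOWN for every user
-- that is not an exact match, so this returns no rows at all.
SELECT u.id
FROM @users u
WHERE u.locked = 0
  AND u.id NOT IN (SELECT s.id_user
                   FROM @subscriptions s
                   WHERE s.name = 'premium');

-- NOT EXISTS: the NULL row simply never matches, so this returns user 2.
SELECT u.id
FROM @users u
WHERE u.locked = 0
  AND NOT EXISTS (SELECT 1
                  FROM @subscriptions s
                  WHERE s.id_user = u.id
                    AND s.name = 'premium');
```

If id_user is declared NOT NULL, the two forms return the same rows, and the choice between them comes down to the plan the optimizer produces.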
You can also improve performance by putting a suitable supporting index on subscriptions, for example:
create nonclustered index ix_subscriptions_id_user_name
on dbo.subscriptions (id_user)
include (name);
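You could go a step further and make it a filtered index, but that probably won't improve performance noticeably.

Assuming indexes have been created on the join columns, try: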
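SELECT A.*
FROM users AS A
LEFT JOIN subscriptions AS B
ON B.id_user = A.id
AND B.name = 'premium' -- filter in the ON clause so users without a match are kept
WHERE A.locked = 'false'
AND B.id_user IS NULL -- keep only users with no premium subscription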
"I read somewhere that SQL Server executes the nested SELECT once for every result row of the outer SELECT"
That is not true at all.
This query:
SELECT *
FROM users
WHERE locked = false
AND id NOT IN (SELECT id_user
FROM subscriptions
WHERE name != 'premium')
gives you all unlocked users who have no non-premium subscription.
To get all unlocked users who have no premium subscription, use:
SELECT *
FROM users u
WHERE locked = 0
AND NOT EXISTS (SELECT *
FROM subscriptions
where id_user = u.id
and name = 'premium')
To test something like this in SQL Server, get SQL Server Management Studio (or a similar tool) and run the following script:
use tempdb
go
--drop table subscriptions
--drop table users
go
CREATE TABLE users (
id int primary key,
locked bit,
name varchar(20)
);
CREATE TABLE subscriptions (
id int primary key,
name varchar(20),
id_user int references users
);
go
with N as
(
select row_number() over (order by (select null)) i
from sys.objects o, sys.columns c, sys.columns c2
)
insert into users (id,locked,name)
select top 10000 i, case when i%100 = 0 then 1 else 0 end, concat('user',i)
from N;
with N as
(
select row_number() over (order by (select null)) i
from sys.objects o, sys.columns c, sys.columns c2
)
insert into subscriptions(id,name,id_user)
select top 100000 i, case when i%100 = 0 then 'premium' else 'standard' end, i%10000 + 1
from N;
go
SELECT *
FROM users u
WHERE locked = 0
AND NOT EXISTS (SELECT *
FROM subscriptions
where id_user = u.id
and name = 'premium')
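One way to compare the variants on this test data is to switch on I/O and timing statistics and run each form in its own batch. The sketch below assumes the users and subscriptions tables created by the script above. The logical reads and elapsed times show up in the Messages tab of SSMS, and the actual numbers will depend on your instance.

```sql
use tempdb
go
-- Report logical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
go
-- Variant 1: NOT IN
SELECT u.*
FROM users u
WHERE u.locked = 0
  AND u.id NOT IN (SELECT s.id_user
                   FROM subscriptions s
                   WHERE s.name = 'premium');
go
-- Variant 2: NOT EXISTS
SELECT u.*
FROM users u
WHERE u.locked = 0
  AND NOT EXISTS (SELECT *
                  FROM subscriptions s
                  WHERE s.id_user = u.id
                    AND s.name = 'premium');
go
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
go
```

Enabling "Include Actual Execution Plan" in SSMS before running the batches also shows whether the subquery is evaluated as a join or row by row.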
It is better to use a LEFT JOIN; the query will of course be faster if indexes exist on the columns in the join condition.
SELECT u.*
FROM users u
LEFT JOIN subscriptions s
ON s.id_user = u.id
AND s.name = 'premium' -- must be in the ON clause, not the WHERE clause
WHERE u.locked = 'false'
AND s.id_user IS NULL -- keep only users with no premium subscription
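First create a clustered index on subscriptions. If the table is very large, maintaining indexes on a heap can be a nightmare. I would probably suggest clustering on id_user.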
CREATE TABLE subscriptions
(
id INT IDENTITY(1,1) PRIMARY KEY NONCLUSTERED,
name VARCHAR(20),
id_user INT
);
CREATE CLUSTERED INDEX CL_id_User ON Subscriptions (id_User)
CREATE TABLE users
(
id INT IDENTITY(1,1),
locked BIT,
name varchar(20),
CONSTRAINT PK_users PRIMARY KEY CLUSTERED (id)
);
Then create a nonclustered index on each of subscriptions and users. If you want to pull all columns from both tables, use INCLUDE.
CREATE NONCLUSTERED INDEX IX_Users_Locked ON Users (Locked) INCLUDE (Name);
CREATE NONCLUSTERED INDEX IX_Subscriptions_Name ON Subscriptions (name);
Then write your query like this:
SELECT *
FROM users u
WHERE u.id NOT IN (SELECT s.id_user
FROM Subscriptions s
WHERE s.name = 'Premium')
AND u.locked = 0;
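Going a step further, use an ID to identify the subscription type. In SQL Server, indexing integers performs far better than indexing strings.

The SQL Server optimizer is cost-based, so it should (in theory) produce the best plan for semantically identical queries. Please add the primary keys, unique constraints, and indexes to the question. Missing indexes are a likely cause of the performance problem; performance that degrades as the tables grow is a symptom of full scans.
"I read somewhere that SQL Server executes the nested SELECT once for every result row of the outer SELECT": not true. It will use set-based operations wherever possible.
@惊奇的DCONUT Can you post sample data?
While the comments above are correct, you can probably rewrite it with a simple join.
Although you tagged this question sql-server, the data types suggest a different DBMS (PostgreSQL?). Please edit the tags accordingly. In any case, an index on the name column of the subscriptions table might improve performance.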