SQL optimization question
Let's start from the scenario defined in a previous question. Now I want to create a query that produces a list of Foos F1, together with the count of Foos F2 that are distinct from F1 but associated with the same Bar or Baz as F1:
SELECT F1.*,
       CASE
           WHEN F1.Bar_ID IS NOT NULL THEN
               ISNULL(Bar.LotNumber + '-', '') + Bar.ItemNumber
           WHEN F1.Baz_ID IS NOT NULL THEN
               ISNULL(Baz.Color + ' ', '') + Baz.Type
       END AS 'Ba?Description',
       (SELECT COUNT(*)
        FROM Foo F2
        WHERE F2.Bar_ID = F1.Bar_ID
           OR F2.Baz_ID = F1.Baz_ID) - 1 AS FooCount
FROM Foo F1
LEFT JOIN Bar ON Bar.Bar_ID = F1.Bar_ID
LEFT JOIN Baz ON Baz.Baz_ID = F1.Baz_ID
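My concern is efficiency. I must admit that I know nothing about how SQL Server generates execution plans from SQL statements, but common sense tells me that the subquery will be executed once for every row of the main query, that is, once for every value of F1.Foo_ID. That is clearly inefficient.
An alternative that does not run into this problem is: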
SELECT F1.*,
       CASE
           WHEN F1.Bar_ID IS NOT NULL THEN
               ISNULL(Bar.LotNumber + '-', '') + Bar.ItemNumber
           WHEN F1.Baz_ID IS NOT NULL THEN
               ISNULL(Baz.Color + ' ', '') + Baz.Type
       END AS 'Ba?Description',
       COUNT(*) - 1 AS FooCount
FROM Foo F1
LEFT JOIN Bar ON Bar.Bar_ID = F1.Bar_ID
LEFT JOIN Baz ON Baz.Baz_ID = F1.Baz_ID
LEFT JOIN Foo F2 ON F2.Bar_ID = F1.Bar_ID
                 OR F2.Baz_ID = F1.Baz_ID
GROUP BY F1.Foo_ID, F1.SomeFooField, F1.SomeOtherField, ...,
         CASE
             WHEN F1.Bar_ID IS NOT NULL THEN
                 ISNULL(Bar.LotNumber + '-', '') + Bar.ItemNumber
             WHEN F1.Baz_ID IS NOT NULL THEN
                 ISNULL(Baz.Color + ' ', '') + Baz.Type
         END
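But this is worse, because it runs into a bigger problem, one that has to do with the fact that SQL databases are not truly relational databases. If they were, the SQL engine would be able to infer that the value of every field not affected by an aggregate function is uniquely determined by F1.Foo_ID, so grouping by F1.Foo_ID alone should be enough to produce the desired result. Yet SQL still forces me to group explicitly by every field that is not affected by an aggregate function. The result? Inefficiency.
A third option that runs into neither of the previous two problems is: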
SELECT Foo.*,
       CASE
           WHEN Foo.Bar_ID IS NOT NULL THEN
               ISNULL(Bar.LotNumber + '-', '') + Bar.ItemNumber
           WHEN Foo.Baz_ID IS NOT NULL THEN
               ISNULL(Baz.Color + ' ', '') + Baz.Type
       END AS 'Ba?Description',
       ISNULL(Temp.FooCount, 0) AS FooCount
FROM Foo
LEFT JOIN Bar ON Bar.Bar_ID = Foo.Bar_ID
LEFT JOIN Baz ON Baz.Baz_ID = Foo.Baz_ID
LEFT JOIN (SELECT F1.Foo_ID, COUNT(*) - 1 AS FooCount
           FROM Foo F1
           JOIN Foo F2 ON F2.Bar_ID = F1.Bar_ID
                       OR F2.Baz_ID = F1.Baz_ID
           GROUP BY F1.Foo_ID) Temp ON Temp.Foo_ID = Foo.Foo_ID
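But this has the drawback of requiring three copies of Foo to be instantiated in memory, rather than just two.
How should I structure my query to produce the desired result in the most efficient way possible?
I agree with the comments that say you can only find out by trying. In your other post you said you have no test data available. My guess is that you don't know how to generate test data, so let me show you.
I assume the following tables exist: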
create table Bar (
    Bar_ID int not null primary key,
    LotNumber varchar(10),
    ItemNumber varchar(10)
)
create table Baz (
    Baz_ID int not null primary key,
    Color varchar(10),
    Type varchar(10)
)
create table Foo (
    Foo_ID int not null primary key,
    Bar_ID int null references Bar,
    Baz_ID int null references Baz,
    SomeFooField varchar(10),
    SomeOtherFooField varchar(10)
)
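Since the optimizer's choices depend on the available indexes, it may be worth testing with indexes on the two foreign key columns as well, because SQL Server does not create indexes on foreign key columns automatically. The following is only a sketch: the index names are hypothetical, and whether the production schema has such indexes is an assumption you would need to check.

```sql
-- Hypothetical index names; adjust to match the real schema.
-- SQL Server indexes the primary keys above, but not the referencing columns.
create index IX_Foo_Bar_ID on Foo (Bar_ID);
create index IX_Foo_Baz_ID on Foo (Baz_ID);
```

Comparing the candidate queries both with and without these indexes should show how much the plans depend on them.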
Now populate Bar with test data:
insert into Bar (Bar_ID) values (0)
insert into Bar (Bar_ID) select Bar_ID + 1 from Bar
insert into Bar (Bar_ID) select Bar_ID + 2 from Bar
insert into Bar (Bar_ID) select Bar_ID + 4 from Bar
insert into Bar (Bar_ID) select Bar_ID + 8 from Bar
insert into Bar (Bar_ID) select Bar_ID + 16 from Bar
insert into Bar (Bar_ID) select Bar_ID + 32 from Bar
insert into Bar (Bar_ID) select Bar_ID + 64 from Bar
-- etc.
update Bar set
    LotNumber = 'LN_' + convert(varchar(10), Bar_ID),
    ItemNumber = 'IN_' + convert(varchar(10), Bar_ID)
Populate Baz:
insert into Baz (Baz_ID) values (0)
insert into Baz (Baz_ID) select Baz_ID + 1 from Baz
insert into Baz (Baz_ID) select Baz_ID + 2 from Baz
insert into Baz (Baz_ID) select Baz_ID + 4 from Baz
insert into Baz (Baz_ID) select Baz_ID + 8 from Baz
insert into Baz (Baz_ID) select Baz_ID + 16 from Baz
insert into Baz (Baz_ID) select Baz_ID + 32 from Baz
-- etc
update Baz set
    Color = 'C_' + convert(varchar(10), Baz_ID),
    Type = 'T_' + convert(varchar(10), Baz_ID)
Put some data in Foo:
insert into Foo (Foo_ID) values (0)
insert into Foo (Foo_ID) select Foo_ID + 1 from Foo
insert into Foo (Foo_ID) select Foo_ID + 2 from Foo
insert into Foo (Foo_ID) select Foo_ID + 4 from Foo
insert into Foo (Foo_ID) select Foo_ID + 8 from Foo
insert into Foo (Foo_ID) select Foo_ID + 16 from Foo
insert into Foo (Foo_ID) select Foo_ID + 32 from Foo
insert into Foo (Foo_ID) select Foo_ID + 64 from Foo
insert into Foo (Foo_ID) select Foo_ID + 128 from Foo
insert into Foo (Foo_ID) select Foo_ID + 256 from Foo
-- etc...
update Foo set
    SomeFooField = 'SFF_' + convert(varchar(10), Foo_ID),
    SomeOtherFooField = 'SOFF_' + convert(varchar(10), Foo_ID)
update Foo set Bar_ID = Bar.Bar_ID
from Bar
where Foo_ID % 128 = Bar.Bar_ID
  and Foo_ID % 3 = 0;
update Foo set Baz_ID = Baz.Baz_ID
from Baz
where Foo_ID % 64 = Baz.Baz_ID
  and Foo_ID % 3 <> 0
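Before running the candidate queries, it may help to check that the generated data has the shape you expect. The queries below are a small sketch that uses only the tables and columns defined above; the column aliases are illustrative.

```sql
-- Row counts per table, and how many Foos reference a Bar or a Baz.
select (select count(*) from Bar) as BarRows,
       (select count(*) from Baz) as BazRows,
       count(*)      as FooRows,
       count(Bar_ID) as FoosWithBar,
       count(Baz_ID) as FoosWithBaz
from Foo;

-- Distribution of Foos per Bar, largest groups first.
select top 10 Bar_ID, count(*) as FoosPerBar
from Foo
where Bar_ID is not null
group by Bar_ID
order by FoosPerBar desc;
```

If the group sizes look very different from what you expect in production, adjust the modulo values in the update statements accordingly.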
Now you can test the queries. I suggest you also try this one:
select F.Foo_id, F.*,
       isnull(R.BarDescription, Z.BazDescription) as 'Ba?Description',
       isnull(R.FooCount, Z.FooCount) - 1 as FooCount
from Foo F
left join (
    select F.Bar_ID,
           isnull(Bar.LotNumber + '-', '') + Bar.ItemNumber as 'BarDescription',
           count(F.Foo_ID) as FooCount
    from Foo F
    join Bar on F.Bar_ID = Bar.Bar_ID
    group by F.Bar_ID, Bar.LotNumber, Bar.ItemNumber
) R on F.Bar_ID = R.Bar_ID
left join (
    select F.Baz_ID,
           isnull(Baz.Color + '-', '') + Baz.Type as 'BazDescription',
           count(F.Foo_ID) as FooCount
    from Foo F
    join Baz on F.Baz_ID = Baz.Baz_ID
    group by F.Baz_ID, Baz.Color, Baz.Type
) Z on F.Baz_ID = Z.Baz_ID
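My old version of SQL Query Analyzer has an option called "Show generated execution plan". Your version probably has it too. It indicates that the query above will run faster than the three you proposed. But that is theory! So fill the tables with as much data as you expect the production system to hold, and then try again.
You cannot make such assumptions simply by reading the SQL itself. The query optimizer works all kinds of magic based on indexes, table statistics, and server specifications. If you want to know which query is the most efficient, you have to look at the execution plan of each one when it runs against a representative data set with up-to-date statistics.
Generate some test data and run benchmarks. That is the only way to get a useful answer.
You can't design a database for "efficiency" instead of designing it to accurately reflect the nature of the data. Besides, designing the database is an activity completely separate from designing the queries.
I think we need to start this discussion from another angle. You seem to be expecting a magic crystal-ball show. As has been suggested many times, and I will repeat it, the only way to accurately predict the performance of these queries is to load your schema with test data and try them. Not that I expect this to sink in suddenly, but do you think all these people are lying to you? Guarding some ancient secret voodoo that tells them beyond doubt which query will perform better? They don't know any more than you do, because there are too many "it depends" variables.
@Aaron: The first rule of SQL Voodoo Club is "don't talk about SQL Voodoo Club". Your membership is in jeopardy.
I know how to generate test data. But is my test data realistic enough?
You have to make sure your test data is realistic enough. Estimate the number of rows you will have, populate the tables accordingly, run the queries, and compare the results. Use SQL Query Analyzer to look at the execution plans of the queries. Try it! Experiment! I found that your first query performs much better than the second and the third. If you study the execution plans, you will see why.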
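The advice above (refresh statistics, then measure each candidate) might look like this in a query window. This is a minimal sketch that assumes the Foo, Bar, and Baz test tables defined earlier; the query between the SET statements is a cut-down version of the first candidate and should be replaced by each query you want to compare.

```sql
-- Refresh optimizer statistics after bulk-loading the test data.
UPDATE STATISTICS Foo;
UPDATE STATISTICS Bar;
UPDATE STATISTICS Baz;

-- Report logical reads and CPU/elapsed time for each statement that follows.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate under test (here: the correlated-subquery version, counts only).
SELECT F1.Foo_ID,
       (SELECT COUNT(*)
        FROM Foo F2
        WHERE F2.Bar_ID = F1.Bar_ID
           OR F2.Baz_ID = F1.Baz_ID) - 1 AS FooCount
FROM Foo F1;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The figures appear in the messages output rather than in the result grid. Running each candidate several times and comparing logical reads tends to be more stable than comparing elapsed time alone.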