SQL: Optimizing a nested query with redundant UNION code?
Although the query below works very well for my purposes, I am trying to find out whether it can be optimized, because it uses the same nested subquery in both statements of the UNION. My gut tells me there should be a way to evaluate the subquery once and use that result for both parts of the union, but I ran into syntax problems when I tried to use a JOIN across the union.
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE clientLocalID =
ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
UNION
SELECT ClientMasterName, '0', clientMasterID, 2
FROM dbo.ClientMaster
WHERE clientMasterID =
ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
ORDER BY HierarchyType ASC;
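To help explain why the original query is needed, picture the concept of local clients and master clients: a local client is tied to a master client, but a master client may or may not have any local clients. Likewise, a phone number associated with a client should be tied to either a local client or a master client, not both. That is why the second statement hard-codes '0' as the local client ID: there is no associated local client.
By the way, this will run against Microsoft SQL Server 2008 R2.

If you don't like repeating the code, you can use a common table expression: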
WITH cte AS (
SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789'
)
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE clientLocalID IN (SELECT HierarchyItem FROM cte)
UNION
SELECT ClientMasterName, '0', clientMasterID, 2
FROM dbo.ClientMaster
WHERE clientMasterID IN (SELECT HierarchyItem FROM cte)
ORDER BY HierarchyType ASC;
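The question mentions trying to JOIN across the union. A minimal sketch of that idea, built on the same CTE, might look like the following. The aliases `cl` and `cm` and the `DISTINCT` in the CTE are additions made here for illustration; `DISTINCT` is only there so that a `HierarchyItem` appearing more than once cannot multiply joined rows. Because a CTE is expanded inline, this is mainly a readability change and is not expected to reduce the number of scans by itself.

```sql
-- Sketch only: join each branch of the UNION to the shared CTE.
WITH cte AS (
    SELECT DISTINCT HierarchyItem
    FROM dbo.ClientPhone
    WHERE PhoneNumber LIKE '%0123456789'
)
SELECT cl.ClientLocalName, cl.ClientLocalID, cl.ClientMasterID, 1 AS HierarchyType
FROM dbo.ClientLocal AS cl
JOIN cte ON cte.HierarchyItem = cl.ClientLocalID      -- local clients matched on ClientLocalID
UNION
SELECT cm.ClientMasterName, '0', cm.ClientMasterID, 2
FROM dbo.ClientMaster AS cm
JOIN cte ON cte.HierarchyItem = cm.ClientMasterID     -- master clients matched on ClientMasterID
ORDER BY HierarchyType ASC;
```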
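The only obvious optimization I can see is to use UNION ALL instead of UNION: because of the last column, there can be no duplicates between the two sub-selects. This assumes that there are no duplicates within each sub-select either.

I don't know how SQL Server handles = ANY, but I assume it is essentially the same as IN. The = ANY subquery will be run once for each branch, but the second time the data should already be cached. You could simplify it to: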
SELECT c.ClientLocalName, c.ClientLocalID, c.ClientMasterID, c.HierarchyType
FROM ((SELECT ClientLocalName, ClientLocalID, ClientMasterID,
              1 AS HierarchyType, ClientLocalID AS MatchID  -- MatchID: helper column holding the key to test
       FROM dbo.ClientLocal
      ) UNION ALL
      (SELECT ClientMasterName, '0', ClientMasterID,
              2, ClientMasterID                             -- master rows are matched on ClientMasterID
       FROM dbo.ClientMaster
      )
     ) c
WHERE c.MatchID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
ORDER BY c.HierarchyType ASC;
I don't know whether this would be much of an improvement.
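The UNION ALL suggestion can also be applied directly to the original two-branch query. The sketch below assumes that ClientLocalID and ClientMasterID are unique within their tables, which the thread suggests but does not confirm; if they are not, adding DISTINCT to each SELECT would keep the result the same as with UNION.

```sql
-- Sketch only: the original query with UNION ALL instead of UNION.
-- The constant HierarchyType column (1 vs 2) guarantees that a row from the
-- first branch can never equal a row from the second branch.
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 AS HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID IN
      (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
UNION ALL
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID IN
      (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%0123456789')
ORDER BY HierarchyType ASC;
```

UNION ALL skips the duplicate-elimination step that UNION performs over the combined result, so any gain depends on how many rows the two branches return.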
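The LIKE pattern has a wildcard at the start of the number, which makes it hard to optimize.

Since nobody seems to have mentioned it: less code does not necessarily mean that it will also execute faster. People often assume that a CTE is executed first (or, with several CTEs, that they are executed in order) and that the final query is then applied to the result; that is not the case. In fact the server treats a CTE just like a derived table in the FROM or JOIN clause, and so it ends up choosing exactly the same query plan. To keep the system from scanning the ClientPhone table twice for LIKE '%0123…', which is a rather "annoying" operation, you have to scan it once, store the result in a temp table, and then use that temp table in the actual query.

To provide some context, I took the liberty of creating some sample data; see the code below. My test results were:

Original query: 71 ms
CTE version: 71 ms, using exactly the same query plan
Temp table: 26 ms to create it, 23 ms for the query that uses it

So even though the server has to create and populate the temp table, the temp-table approach takes only 49 ms in total instead of the original 71 ms.

Of course, depending on the number of records involved and the complexity of the repeated query, your mileage may vary. As I said, a repeated predicate such as LIKE '%blah' is a nasty thing because it requires a full scan of the table or, at best, of a covering index. If it were WHERE pk_field = @value, the outcome could be very different.

Happy querying.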
IF DB_ID('test') IS NULL CREATE DATABASE test
GO
USE test
GO
-- setup
IF OBJECT_ID('ClientPhone') IS NOT NULL DROP TABLE ClientPhone
IF OBJECT_ID('ClientLocal') IS NOT NULL DROP TABLE ClientLocal
IF OBJECT_ID('ClientMaster') IS NOT NULL DROP TABLE ClientMaster
GO
SELECT TOP 50000
HierarchyItem = IDENTITY(int, 1, 1),
PhoneNumber = Convert(varchar(100), NewID())
INTO dbo.ClientPhone
FROM sys.objects x1, sys.columns x2, sys.objects x3, sys.columns x4
SELECT ClientLocalName = 'client dummy',
ClientLocalID = Convert(int, Rand(HierarchyItem * 37) * 50000),
ClientMasterID = Convert(int, Rand(HierarchyItem * 47) * 50000),
HierarchyType = 1
INTO dbo.ClientLocal
FROM dbo.ClientPhone
SELECT ClientMasterName = 'master dummy',
ClientLocalID = Convert(int, Rand(HierarchyItem * 51) * 50000),
ClientMasterID = Convert(int, Rand(HierarchyItem * 53) * 50000),
HierarchyType = 2
INTO dbo.ClientMaster
FROM dbo.ClientPhone
GO
-- original
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
UNION
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
ORDER BY HierarchyType ASC;
-- CTE
;WITH cte
AS (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID = ANY (SELECT HierarchyItem FROM cte)
UNION
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID = ANY (SELECT HierarchyItem FROM cte)
ORDER BY HierarchyType ASC;
-- temp table
IF OBJECT_ID('tempdb..#ClientPhone') IS NOT NULL DROP TABLE #ClientPhone
SELECT HierarchyItem INTO #ClientPhone FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01'
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID = ANY (SELECT HierarchyItem FROM #ClientPhone)
UNION
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID = ANY (SELECT HierarchyItem FROM #ClientPhone)
ORDER BY HierarchyType ASC;
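The answer does not say how the timings were measured. One way to obtain comparable numbers in SQL Server Management Studio or sqlcmd, offered here only as a sketch, is to switch on the session settings SET STATISTICS TIME and SET STATISTICS IO before running each variant from the script and then read the elapsed time and logical reads from the Messages output.

```sql
-- Sketch only: report CPU/elapsed time and page reads for each statement.
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
GO

-- Run the variant to be measured; the original query is shown as an example.
SELECT ClientLocalName, ClientLocalID, ClientMasterID, 1 HierarchyType
FROM dbo.ClientLocal
WHERE ClientLocalID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
UNION
SELECT ClientMasterName, '0', ClientMasterID, 2
FROM dbo.ClientMaster
WHERE ClientMasterID = ANY (SELECT HierarchyItem FROM dbo.ClientPhone WHERE PhoneNumber LIKE '%01')
ORDER BY HierarchyType ASC;
GO

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO
```

With STATISTICS IO on, the number of scans and logical reads reported for ClientPhone is a more stable indicator than wall-clock time of whether the table is read once or twice, since timings on a table this small vary from run to run.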
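Just because you repeat the code does not necessarily mean that it gets evaluated repeatedly. The optimizer will most likely notice the repetition and reuse the result of the first evaluation, unless, that is, it judges re-evaluation to be more appropriate. If this is only about the repeated code, use the CTE solution that was provided.

There is probably a lot to optimize in this query. First of all, I don't see how the query can run, because you cannot union ClientLocalName and ClientMasterName unless those columns have the same name. Can you provide a little sample data and your expected results? A CTE for ClientPhone would work, but is that really the data you are trying to get?

@Shawn As long as the data types are the same across the unions, the column names do not matter. The query really does work fine. In any case, Lukas's answer gives me what I wanted.

@stickybit is right, but the real code has more conditions in the repeated part, so there is good motivation to avoid the repetition.

@stickybit: Actually, the repeated code will definitely be evaluated more than once. Of course the underlying data only needs to be loaded into the cache once, but the actual execution happens as often as it is requested. See the example in the sample script.

It is worth mentioning that changing UNION to UNION ALL can change the overall result if there are duplicates within the sub-results, that is, within the SELECTs that are the operands of the UNION or UNION ALL. The names of the columns ClientLocalID and ClientMasterID suggest that a PK is present in the sub-results, so no duplicates should occur. If that impression is wrong, however, then with UNION ALL you need SELECT DISTINCT to make each sub-result duplicate-free on its own, which may still be faster because the sets being deduplicated are smaller.

Unfortunately, the basic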