SQL: query time for a join using a subquery vs. strings
Tags: sql, sql-server-2005, join
I have a query like this:
SELECT * FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id
AND t2.time IN (SELECT year FROM activeYears WHERE active = 1)
where activeYears.year is NVARCHAR(50), with one row per year.
Why does that join run faster than this query:
SELECT * FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id AND t2.time IN ('2009','2010')
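This is the short version; basically I have a large query with joins that uses a subquery. When I changed the subquery to strings for testing, the run time doubled, even after clearing the cache. I thought it might be a casting issue, but I also tried declaring two variables as NVARCHAR(50) and using them in the query, and it made no difference.
This has puzzled me for a few days. I can't see why the subquery would be faster unless the two queries are actually being built differently.
Thanks
Edit - execution plan info
I diffed the two execution plans and will try to give you the anonymized highlights.
The MissingIndexes section of the execution plan for the faster subquery: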
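A minimal sketch of how the two variants could be timed under comparable conditions, assuming a test server where flushing the caches is acceptable. The commands are standard SQL Server ones; the table names are the simplified ones from the question, not the real query.

```sql
-- Assumption: a TEST instance. These commands affect the whole server.
CHECKPOINT;              -- write dirty pages so the buffer pool can be emptied
DBCC DROPCLEANBUFFERS;   -- start from a cold data cache
DBCC FREEPROCCACHE;      -- discard cached plans so the query is recompiled

SET STATISTICS IO ON;    -- per-table logical/physical reads
SET STATISTICS TIME ON;  -- CPU and elapsed time

-- Variant A: subquery
SELECT * FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id
AND t2.time IN (SELECT year FROM activeYears WHERE active = 1);

-- Flush again so variant B does not benefit from pages read by variant A.
CHECKPOINT;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

-- Variant B: string literals
SELECT * FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id AND t2.time IN ('2009','2010');

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Comparing the per-table reads in the Messages output, rather than only the elapsed time, helps show which operator accounts for the difference.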
<QueryPlan CachedPlanSize="196" CompileTime="2166" CompileCPU="2166" CompileMemory="18640">
<MissingIndexes>
<MissingIndexGroup Impact="41.4663">
<MissingIndex Database="[database]" Schema="[dbo]" Table="[table2]">
<ColumnGroup Usage="EQUALITY">
<Column Name="[time]" ColumnId="3" />
<Column Name="[id]" ColumnId="10" />
</ColumnGroup>
</MissingIndex>
</MissingIndexGroup>
</MissingIndexes>
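The MissingIndexes element above names two equality columns on table2. A sketch of the index it hints at could look like the following; the index name IX_table2_time_id is hypothetical, and the Impact value is only the optimizer's estimate, so the effect would need to be measured rather than assumed.

```sql
-- Hypothetical index name; the columns are taken from the MissingIndex element.
-- [time] is bracketed because it is a data type name in later SQL Server versions.
CREATE NONCLUSTERED INDEX IX_table2_time_id
    ON dbo.table2 ([time], id);
```

Because the query uses SELECT *, the plan may still need lookups into the base table for the remaining columns, so the index alone may not remove the large number of logical reads on table2.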
And the statistics for the string version:
Table 'table2'. Scan count 1, logical reads 2339848, physical reads 0, read-ahead reads 2303, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table3'. Scan count 1016, logical reads 4467, physical reads 21, read-ahead reads 1047, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 659, logical reads 5863, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table4'. Scan count 1, logical reads 126, physical reads 0, read-ahead reads 126, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table1'. Scan count 1033, logical reads 5228, physical reads 60, read-ahead reads 120, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table5'. Scan count 1, logical reads 219, physical reads 0, read-ahead reads 219, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'activeYear'. Scan count 1, logical reads 2, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table6'. Scan count 1, logical reads 2, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16719 ms, elapsed time = 17447 ms.
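The real query joins a few more tables than the simplified one, but all I care about is the change from strings to a subquery on a single join. Hopefully that is independent, since t1 joins separately to every table in this query.
A guess: the IN in the first query is converted to a JOIN: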
SELECT * FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id
JOIN activeYears ay ON t2.time = ay.year
WHERE ay.active = 1
while the IN in the second query is converted to OR, or even to a UNION:
SELECT * FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id
WHERE t2.time = '2009'
OR t2.time = '2010'
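If the guess above is right and the join form is what makes the first query faster, one way to test it is to give the literal values the same shape as the subquery by loading them into a table variable and joining to it. This is only a sketch under that assumption; @years is a hypothetical name, and the optimizer's row estimate for a table variable may differ from its estimate for activeYears.

```sql
-- Hypothetical table variable standing in for the activeYears subquery.
DECLARE @years TABLE (year NVARCHAR(50) PRIMARY KEY);

-- One INSERT per row: SQL Server 2005 has no multi-row VALUES clause.
INSERT INTO @years (year) VALUES (N'2009');
INSERT INTO @years (year) VALUES (N'2010');

SELECT t1.*, t2.*
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.id
JOIN @years y ON t2.time = y.year;
```

If this version runs about as fast as the subquery version, that supports the idea that the plan shape (join vs. OR predicate) is the cause, rather than a casting issue.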
Comments:
Have you compared the plans? Have you tested with SET STATISTICS IO and SET STATISTICS TIME turned on?
I guess the second form tricks the query optimizer into starting from t2.time, expecting that to make the set smaller. The first looks like it starts from t1 and then joins to t2, because an IN subquery is generally more expensive.
There could be many reasons; check the query plans for hints. One guess is that the former can use an index that the latter cannot.
What do you get if you execute: SELECT * FROM table1 t1 JOIN table2 t2 ON t1.id = t2.id WHERE t2.time IN ('2009','2010')
@Mitch, doing that is much faster. In fact, when I do that, the result reverses. Using WHERE t2.time IN ('2009','2010') has a slightly higher estimated cost but runs faster. Using WHERE t2.time IN (SELECT year FROM activeYears WHERE active = 1) has a slightly lower estimated cost but returns results more slowly.
@Richard I added some statistics as you suggested. I'm not sure what Worktable is, but it seems to be the biggest difference, unless I'm missing something more important?
Given those ScalarOperator tags, this makes sense; I see the OR being used. But in the faster query it says [t2].[time]=N'2009' and [t2].[time]=N'2010'... It can't know that these are the only two possible values there, so I can only assume it is using an index to know that there are no values between those two nvarchars that the wrong query would pick up? I might have concluded that it simply knows there is nothing between 2009 and 2010, but the actual field values are somewhat different from that and are not even numbers. They look more like 1982-1985 and 1986-2000.
Statistics for the faster subquery version (from the question's edit):
Table 'activeYear'. Scan count 2, logical reads 2010, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table2'. Scan count 1, logical reads 2339848, physical reads 0, read-ahead reads 2303, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table3'. Scan count 1016, logical reads 4624, physical reads 21, read-ahead reads 1047, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 12, logical reads 109, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table4'. Scan count 1, logical reads 126, physical reads 0, read-ahead reads 126, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table1'. Scan count 1033, logical reads 5331, physical reads 57, read-ahead reads 123, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table5'. Scan count 1, logical reads 219, physical reads 0, read-ahead reads 219, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'table6'. Scan count 1, logical reads 2, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 10328 ms, elapsed time = 11479 ms.