Inefficient SQL query that uses datetime calculation. How to optimize?


sql sql-server


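The problem comes from a real environment where the production_plan table captures the order identification and other details in each row. Each row is updated when production of a product starts and again after it finishes, to capture the UTC time of those events.

There is a separate table, temperatures, that collects several temperatures on the production line at regular intervals, independently of anything else, stored with the UTC time.

The goal is to extract the sequence of measured temperatures for the production of each product. (The data should then be processed, a chart of the values created, and the chart attached to the product item documentation for audit purposes.)

Updated after marc_'s comment: the original question did not consider any indices. The updated text takes them into account, along with the original measurements mentioned in the comments.

The tables and indices were created as follows: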

    CREATE TABLE production_plan (
        order_id        nvarchar(50) NOT NULL,
        production_line uniqueidentifier NULL,
        prod_start      DATETIME NULL,
        prod_end        DATETIME NULL
    );

    -- About 31 000 rows inserted, ordered by order_id.
    ...

    -- Clustered index on order_id.
    CREATE CLUSTERED INDEX ind_order_id
    ON production_plan (order_id ASC);

    -- Non-clustered indices on the other columns.
    CREATE INDEX ind_times
    ON production_plan (production_line ASC,
                        prod_start ASC,
                        prod_end ASC);

    ------------------------------------------------------

    -- There are actually more temperatures for one time (i.e. more
    -- sensors). The UTC is the real time of the row insertion, hence
    -- the primary key.
    CREATE TABLE temperatures (
        UTC             datetime PRIMARY KEY NOT NULL,
        production_line uniqueidentifier NULL,
        temperature_1   float NULL
    );

    -- About 91 000 rows inserted ordered by UTC.
    ...

    -- Clustered index on UTC is created automatically
    -- because of the PRIMARY KEY. Indices on temperature(s)
    -- do not make sense.

    -- Non-clustered index for production_line
    CREATE INDEX ind_pl ON temperatures (production_line ASC);

    -- The tables were created, records inserted, and the indices
    -- created in less than 1 second (for the sample on my computer).
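
The idea was first to join the tables on the production_line identification, and then to fit the temperature UTC times into the UTC time range between the start and the end of production: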

    -- About 45 000 rows in about 24 seconds when no indices were used.
    -- The same took less than one second with the indices (for my data
    -- and my computer).
    SELECT pp.order_id,     -- not related to the problem
           pp.prod_start,   -- UTC of the start of production
           pp.prod_end,     -- UTC of the end of production
           t.UTC,           -- UTC of the temperature measurement
           t.temperature_1  -- the measured temperature
      INTO result_table02
      FROM production_plan AS pp
           JOIN temperatures AS t
             ON pp.production_line = t.production_line
                AND t.UTC BETWEEN pp.prod_start
                              AND pp.prod_end
     ORDER BY t.UTC;
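
About 24 seconds is not acceptable. Clearly, the indices were necessary: with them, the same operation took less than 1 second (the time shown in the yellow bar below the results tab in Microsoft SQL Server Management Studio).

However, a second problem remains.

Because the temperature measurement is not very frequent, and because the place of measurement is slightly shifted in time from the start of production, a time correction must be applied. In other words, two offsets must be added to the boundaries of the time range. I ended up with a query like this: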

    -- About 46 000 rows in about 9 minutes without indices.
    -- It took about the same also with indices
    -- (8:50 instead of 9:00 or so).
    DECLARE @offset_start INT;
    SET @offset_start = -60;  -- one minute = one sample before

    DECLARE @offset_end INT;
    SET @offset_end = +60;    -- one minute = one sample after

    SELECT pp.order_id,     -- not related to the problem
           pp.prod_start,   -- UTC of the start of production
           pp.prod_end,     -- UTC of the end of production
           t.UTC,           -- UTC of the temperature measurement
           t.temperature_1  -- the measured temperature
      INTO result_table03
      FROM production_plan AS pp
           JOIN temperatures AS t
             ON pp.production_line = t.production_line
                AND t.UTC BETWEEN DATEADD(second, @offset_start, pp.prod_start)
                              AND DATEADD(second, @offset_end, pp.prod_end)
     ORDER BY t.UTC;
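
With the DATEADD() calculation the query takes about 9 minutes, almost regardless of whether the indices were created or not.

Thinking more about how to solve the problem, it seems to me that the corrected time boundaries (the UTC values with the offsets added) need their own index to be processed efficiently. Creating a temporary table came to mind. An index could then be created on its corrected columns, and one more JOIN against it should help. After that the table could be dropped.

Is the basic idea of the temporary table correct? Are there other techniques for doing this?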

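Thanks for your suggestions. I will update the timing results after introducing the indices you suggest. Please explain the reason for the expected improvement. I am a beginner in writing SQL solutions.

You can typically optimize a query by: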

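picking a good clustering key on your tables. Good means narrow, unique, static, and ever-increasing. INT IDENTITY is a classic good key; a GUID is a really bad example (since GUIDs lead to excessive index fragmentation; read Kim Tripp for more details)

making sure all foreign key columns in child tables are indexed, so that joins and lookups run faster

selecting only as few columns as you really need (you seem to be doing that well)

trying to cover queries, e.g. creating an index on the table that contains all the required columns, either directly as index columns or as included columns (SQL Server 2008 and later)

possibly adding other indices to speed up range queries and/or to help with sorting/ordering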
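
The first two points can be sketched roughly as follows. This is only an illustration under assumptions: the tables production_line_demo and production_plan_demo, their INT IDENTITY surrogate keys, and all constraint and index names are hypothetical and are not part of the schema in the question.

```sql
-- Hypothetical parent table with a narrow, unique, static,
-- ever-increasing clustering key (INT IDENTITY).
CREATE TABLE production_line_demo (
    line_id   INT IDENTITY(1,1) NOT NULL
              CONSTRAINT PK_production_line_demo PRIMARY KEY CLUSTERED,
    line_name nvarchar(50) NOT NULL
);

-- Hypothetical child table that references the parent.
CREATE TABLE production_plan_demo (
    plan_id    INT IDENTITY(1,1) NOT NULL
               CONSTRAINT PK_production_plan_demo PRIMARY KEY CLUSTERED,
    order_id   nvarchar(50) NOT NULL,
    line_id    INT NOT NULL
               CONSTRAINT FK_production_plan_demo_line
               FOREIGN KEY REFERENCES production_line_demo (line_id),
    prod_start DATETIME NULL,
    prod_end   DATETIME NULL
);

-- SQL Server does not index foreign key columns automatically,
-- so the index on the referencing column is created explicitly.
CREATE INDEX IX_production_plan_demo_line_id
    ON production_plan_demo (line_id);
```

Whether a surrogate key like this is appropriate depends on how the real tables are loaded and queried, so treat the layout as a starting point rather than a recommendation for the tables in the question.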

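Looking at your query and table definitions:

I don't seem to see any primary keys. Add them!

you must make sure you have a foreign key index on pp.production_line (assuming t.production_line is the primary key of the other table)

you should see if you can find a good index to handle t.UTC

you should check whether it makes sense to create an index on production_plan2 that contains all the columns used (order_id, pp.prod_start, pp.prod_end)

you should check whether it makes sense to create an index on temperatures2 that includes all the columns used (UTC, temperature_1)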
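
As a minimal sketch of what such covering indices might look like, assuming the table and column names from the CREATE TABLE statements in the question (production_plan and temperatures). The index names are hypothetical, and the second index overlaps the existing ind_times, so in practice only one of the two would be kept.

```sql
-- Hypothetical index names; tables and columns are those from the question.

-- The join can seek on (production_line, UTC); temperature_1 is stored
-- at the leaf level of the index, so no lookup into the base table is
-- needed to return it.
CREATE NONCLUSTERED INDEX ix_temperatures_line_utc
    ON temperatures (production_line ASC, UTC ASC)
    INCLUDE (temperature_1);

-- Key columns match the join predicate; order_id is added as an
-- included column so the index carries every column the query reads.
CREATE NONCLUSTERED INDEX ix_production_plan_line_times
    ON production_plan (production_line ASC, prod_start ASC, prod_end ASC)
    INCLUDE (order_id);
```

Whether the optimizer actually picks these indices for the offset query should be confirmed in the actual execution plan rather than assumed.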

Update: you can capture the actual execution plan by enabling that option from the SSMS toolbar, or from the menu under Query > Include Actual Execution Plan.

    select getdate()+1.000/(24.00*60.00)   -- one minute as a fraction of a day
    select getdate()+0.000694444

    CREATE INDEX ind_pl ON temperatures (production_line ASC, UTC);

    SELECT pp.order_id,     -- not related to the problem
           pp.prod_start,   -- UTC of the start of production
           pp.prod_end,     -- UTC of the end of production
           t.UTC,           -- UTC of the temperature measurement
           t.temperature_1  -- the measured temperature
      INTO result_table02
      FROM production_plan AS pp
           CROSS APPLY (
               SELECT t1.utc, t1.temperature_1
                 FROM temperatures AS t1
                WHERE t1.production_line = pp.production_line
                  AND t1.UTC BETWEEN DATEADD(second, @offset_start, pp.prod_start)
                                 AND DATEADD(second, @offset_end, pp.prod_end)
           ) t
     ORDER BY t.UTC;

    -- UTC range expanded by the offsets -- temporary table used.
    -- (Much better -- less than one second.)
    DECLARE @offset_start INT;
    SET @offset_start = -60  -- one minute = one sample before

    DECLARE @offset_end INT;
    SET @offset_end = +60    -- one minute = one sample after

    -- Temporary table with the production_plan UTC range expanded.
    SELECT production_line,
           order_id,
           prod_start,
           prod_end,
           DATEADD(second, @offset_start, prod_start) AS start,
           DATEADD(second, @offset_end, prod_end) AS bend
      INTO #pp
      FROM production_plan;

    CREATE INDEX ind_UTC
    ON #pp (production_line ASC, start ASC, bend ASC);

    SELECT order_id,
           prod_start,
           prod_end,
           UTC,
           temperature_1
      INTO result_table06
      FROM #pp
           JOIN temperatures AS t
             ON #pp.production_line = t.production_line
                AND UTC BETWEEN #pp.start
                            AND #pp.bend
     ORDER BY UTC;

    DROP TABLE #pp;

    CREATE CLUSTERED INDEX ind_UTC ON result_table06 (UTC ASC);

    ALTER TABLE production_plan ADD
        offset_start int NOT NULL
            CONSTRAINT DF__production_plan__offset_start DEFAULT 0,
        offset_end int NOT NULL
            CONSTRAINT DF__production_plan__offset_end DEFAULT 0,
        prod_start_UTC as CAST(DATEADD(second,offset_start,prod_start) as DATETIME) PERSISTED NOT NULL ,
        prod_end_UTC as CAST(DATEADD(second,offset_end,prod_end) as DATETIME) PERSISTED NOT NULL

    -- or just
    --ALTER TABLE production_plan ADD
    --    prod_start_UTC as CAST(DATEADD(second,-60,prod_start) as DATETIME) PERSISTED NOT NULL ,
    --    prod_end_UTC as CAST(DATEADD(second,60,prod_end) as DATETIME) PERSISTED NOT NULL

    IF EXISTS (SELECT * FROM sys.indexes
                WHERE object_id = OBJECT_ID(N'[dbo].[temperatures]')
                  AND name = N'ind_pl')
        DROP INDEX [ind_pl] ON [dbo].[temperatures] WITH ( ONLINE = OFF )

    CREATE INDEX ind_times_UTC
    ON production_plan (production_line ASC, prod_start_UTC ASC, prod_end_UTC ASC);

    SELECT pp.order_id,     -- not related to the problem
           pp.prod_start,   -- UTC of the start of production
           pp.prod_end,     -- UTC of the end of production
           t.UTC,           -- UTC of the temperature measurement
           t.temperature_1  -- the measured temperature
      INTO result_table05
      FROM production_plan AS pp
           JOIN temperatures AS t
             ON pp.production_line = t.production_line
                AND t.UTC BETWEEN pp.prod_start_UTC
                              AND pp.prod_end_UTC
     ORDER BY t.UTC;
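
To compare the variants above on more than wall-clock time, one option is to switch on the session statistics before running each of them. The following is a minimal sketch, assuming the tables from the question exist; the SELECT is the plain join from the question (without INTO, so it can be rerun) and stands in for whichever variant is being measured.

```sql
SET STATISTICS TIME ON;  -- report parse/compile and execution times
SET STATISTICS IO ON;    -- report logical and physical reads per table

-- Place the variant under test here. The plain join from the question
-- is used only as a stand-in.
SELECT pp.order_id,
       pp.prod_start,
       pp.prod_end,
       t.UTC,
       t.temperature_1
  FROM production_plan AS pp
       JOIN temperatures AS t
         ON pp.production_line = t.production_line
            AND t.UTC BETWEEN pp.prod_start
                          AND pp.prod_end
 ORDER BY t.UTC;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

In SSMS the figures are printed on the Messages tab. Comparing logical reads between the DATEADD() join, the temporary table, and the persisted computed columns usually shows where the time goes more clearly than elapsed time alone.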