
MSSQL stored procedure is slow when processing an Excel file (only 30,000 rows)

sql sql-server stored-procedures


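I have a web application with an interface where users can upload files. The data in the Excel file is collected, concatenated, and passed to a stored procedure that processes it and returns the results. A brief description of the stored procedure: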

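The stored procedure receives the string, splits it on the delimiter, and stores the pieces in a temporary table variable.

Another step loops through the temp table and, for each string, finds the exact match count and the approximate match count by comparing it against a view (`getalldata` in the code below) that holds all the names to compare against.

The exact match count is the number of places where the exact string is found in the view, e.g. Bobby Bolonski. The approximate match is done with a database function that implements the Levenshtein distance algorithm, using a frequency of 2.

The resulting name, exact match count, and approximate match count are stored in a final temp table, @temp1.

A select statement is run on the final temp table to return all the data to the application.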

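My problem is that when I pass a huge file, such as an Excel file with 27,000 names, it takes about 2 hours for the database to process and return the data.

I have checked both the server that hosts the application and the server that hosts the database. On the application server, memory and CPU usage are both below 15%. On the database server, memory and CPU usage are also below 15%.

I am looking for suggestions on what improvements I can make to speed up the process.

Below is a copy of the stored procedure that does all the work and returns the results to the web application: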

CREATE PROCEDURE [dbo].[FindMatch]
    @fullname varchar(max),@frequency int,
    @delimeter varchar(max) AS    

    set @frequency = 2

    declare @transID bigint

    SELECT @transID = ABS(CAST(CAST(NEWID() AS VARBINARY(5)) AS Bigint)) 

    DECLARE @exactMatch int = 99
    DECLARE @approximateMatch int = 99
    declare @name varchar(50)
    DECLARE @TEMP1 TABLE (fullname varchar(max),approxMatch varchar(max), exactmatch varchar(max))

    DECLARE @ID varchar(max)

    --declare a temp table
     DECLARE @TEMP TABLE (ID int ,fullname varchar(max),approxMatch varchar(max), exactmatch varchar(max))
     --split and store the result in the @temp table
     insert into @TEMP (ID,fullname) select * from fnSplitTest(@fullname, @delimeter)

     --loop through the @temp table
     WHILE EXISTS (SELECT ID FROM @TEMP)
     BEGIN
        SELECT Top 1 @ID = ID FROM @TEMP 
        select @name = fullname from @TEMP where id = @ID 


          --get the exact match count of the first row from the @temp table and so on until the loop ends
          select @exactMatch = count(1) from  getalldata where  replace(name,',','') COLLATE Latin1_general_CI_AI =  @name COLLATE Latin1_general_CI_AI

        --declare temp @TEMP3
        DECLARE @TEMP3 TABLE (name varchar(max))


        --insert into @temp 3 only the data that are similar to our search name so as not to loop over all the data in the view
        INSERT INTO @TEMP3(name) 
        select  name from getalldata where  SOUNDEX(name) LIKE SOUNDEX(@name) 

        --get the approximate count using the [DamLev] function. 
        --this function uses the Damerau-Levenshtein distance algorithm to calculate the distance between the search string
        --and the names inserted into @temp3 above. Uses frequency 2 so as to eliminate all the others
        select @approximateMatch = count(1) from @TEMP3 where
        dbo.[DamLev](replace(name,',',''),@name,@frequency) <= @frequency and 
        dbo.[DamLev](replace(name,',',''),@name,@frequency) > 0  and name != @name


        --insert into @temp1 at end of every loop results
          insert into  @TEMP1 (fullname,approxMatch, exactmatch) values(@name,@approximateMatch,@exactMatch)
        insert into FileUploadNameInsert (name) values (@name + ' ' +cast(@approximateMatch as varchar) + ' ' + cast(@exactMatch as varchar) + ', ' + cast(@transID as varchar)  )
        DELETE FROM @TEMP WHERE ID= @ID
        delete from @TEMP3
    END

    --Return all the data stored in @temp1
    select fullname,exactmatch,approxMatch, @transID as transactionID from @TEMP1

GO

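In my opinion:

Use OPENROWSET to read the records directly into a predefined, properly indexed table in the database.

Then use a predefined stored procedure to operate on that table on the back end.

30,000 rows should take about 15 minutes.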
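As a rough illustration of this suggestion, the load could look something like the sketch below. Everything specific in it is an assumption for the example: the staging table `dbo.UploadedNames`, its index, the file path, the sheet name `Sheet1`, and a header column called `fullname`. It also assumes the Microsoft ACE OLE DB provider is installed on the database server and that the `Ad Hoc Distributed Queries` option is enabled.

```sql
-- Hypothetical staging table; name, column size and index are assumptions.
CREATE TABLE dbo.UploadedNames (
    ID       int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    fullname varchar(200)      NOT NULL
);

CREATE INDEX IX_UploadedNames_fullname
    ON dbo.UploadedNames (fullname);

-- Read the worksheet straight into the staging table in one statement.
-- The path, sheet name and column header below are placeholders.
INSERT INTO dbo.UploadedNames (fullname)
SELECT x.fullname
FROM OPENROWSET(
         'Microsoft.ACE.OLEDB.12.0',
         'Excel 12.0;Database=C:\uploads\names.xlsx;HDR=YES',
         'SELECT fullname FROM [Sheet1$]'
     ) AS x
WHERE x.fullname IS NOT NULL;
```

Loading this way avoids concatenating 27,000 names into one `varchar(max)` parameter and splitting them again inside the procedure.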

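Can you post the code of fnSplitTest?

Is the Excel sheet a .csv or .xls file, and how do you load it? First you need to find the culprit that slows the application down: is it importing the CSV, inserting the data into the temp table, or processing the records? Processing the records is fairly simple. I would still recommend a CTE, but that is the next step.

Have you looked at your execution plan? Those REPLACE calls, your functions and the LIKE statement, plus the table variables, look scary.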
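To make the comments about record processing more concrete, here is a minimal set-based sketch that computes both counts for every uploaded name in a single statement instead of a `WHILE` loop. It is not necessarily the CTE the commenter had in mind. It reuses `fnSplitTest`, `getalldata` and `dbo.DamLev` exactly as they are called in the question; the temp table `#Names`, its column sizes and the sample input are assumptions made for the example.

```sql
-- Sample input; in the real procedure these are the parameters.
DECLARE @fullname  varchar(max) = 'Bobby Bolonski|Jane Doe';  -- made-up data
DECLARE @delimeter varchar(max) = '|';
DECLARE @frequency int          = 2;

-- Hypothetical temp table replacing the @TEMP table variable.
-- Unlike table variables, temp tables have statistics.
CREATE TABLE #Names (
    ID       int          NOT NULL PRIMARY KEY,
    fullname varchar(200) NOT NULL
);

INSERT INTO #Names (ID, fullname)
SELECT * FROM fnSplitTest(@fullname, @delimeter);

-- One pass over all names: no loop, and DamLev is written once per row.
SELECT n.fullname,
       (SELECT COUNT(1)
        FROM getalldata AS g
        WHERE REPLACE(g.name, ',', '') COLLATE Latin1_general_CI_AI
            = n.fullname COLLATE Latin1_general_CI_AI) AS exactmatch,
       (SELECT COUNT(1)
        FROM getalldata AS g
        CROSS APPLY (SELECT dbo.DamLev(REPLACE(g.name, ',', ''),
                                       n.fullname, @frequency) AS dist) AS d
        WHERE SOUNDEX(g.name) = SOUNDEX(n.fullname)
          AND g.name <> n.fullname
          AND d.dist > 0
          AND d.dist <= @frequency) AS approxMatch
FROM #Names AS n;

DROP TABLE #Names;
```

The `REPLACE(...) COLLATE ...` and `SOUNDEX(...)` expressions still prevent index seeks on `getalldata`. If that view sits on a table you control, storing the cleaned name and its SOUNDEX code in indexed columns would likely help further, and the execution plan should show whether it does.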