Checking for duplicates in a stored procedure insert

Tags: sql, sql-server, stored-procedures

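I am trying to write a stored procedure that inserts data, but with some fairly simple checks first, which seems like good practice.

The table currently has 300 columns: a sequential primary_key_id, a column we want to check before inserting (say, address), a child_of column that is used when new data arrives (the data we are inserting), and then the remaining 297 columns.

So let's assume the table currently looks like this: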

----------------------------------------------------------------------
|PK    |Address             |child_of    |other_attr_1|other_attr2|...
----------------------------------------------------------------------
|1     | 123 Main St        |NULL        |...         |...        |...
|2     | 234 South Rd       |NULL        |...         |...        |...
|3     | 345 West Rd        |NULL        |...         |...        |...
----------------------------------------------------------------------
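
We want to add a row in which the address has a new attribute value, new, in the other_attr_1 column. The new row will use child_of to reference the primary_key_id of the previous record for that address. This should allow for a basic history (I hope).

How do I check for duplicates inside the stored procedure? Do I have to compare each input parameter, one by one, with the value already in the database (if there is one)?

Here is the code I have so far: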

USE [databaseINeed]
-- SET some_stuff ON --or off :)
-- ....
-- GO
CREATE Procedure [dbo].[insertNonDuplicatedData]
  @address text, @other_attr_1 numeric = NULL, @other_attr_2 numeric = NULL, @other_attr_3 numeric = NULL,....;
AS
BEGIN TRY
  -- If the address already exists, lets check for updated data
  IF EXISTS (SELECT 1 FROM tableName WHERE address = @address)  
    BEGIN 
      -- Look at the incoming data vs the data already in the record

      --HERE IS WHERE I THINK THE CODE SHOULD GO, WITH SOMETHING LIKE the following pseudocode:
      if any attribute parameter values is different than what is already stored
        then Insert into tableName (address, child_of, attrs) Values (@address, THE_PRIMARY_KEY_OF_THE_RECORD_THAT_SHARES_THE_ADDRESS, @other_attrs...)    

      RETURN
    END       
  -- We don't have any data like this, so lets create a new record altogther
  ELSE
    BEGIN
      -- Every time a SQL statement is executed it returns the number of rows that were affected.  By using "SET NOCOUNT ON" within your stored procedure you can shut off these messages and reduce some of the traffic.
      SET NOCOUNT ON
      INSERT INTO tableName (address, other_attr_1, other_attr_2, other_attr_3, ...)
      VALUES(@address,@other_attr_1,@other_attr_2,@other_attr_3,...)
    END
END TRY
BEGIN CATCH
  ...
END CATCH
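
I also tried adding a constraint on the table itself that requires all 297 attributes to be unique when checked together with the address column: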

ALTER TABLE tableName ADD CONSTRAINT
  uniqueAddressAttributes UNIQUE -- tried also with NONCLUSTERED
   (other_attr_1,other_attr_2,...) 
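
But I got an error:

ERROR: cannot use more than 32 columns in an index  SQL state: 54011

I think I may be on the wrong track in trying to rely on a unique constraint.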

Of course, having this many columns is not good practice. That said, you can try using INTERSECT to check all the values at once:

-- I assume you get the last id to set the 
-- THE_PRIMARY_KEY_OF_THE_RECORD_THAT_SHARES_THE_ADDRESS
DECLARE @PK int = (SELECT MAX(PK) FROM tableName WHERE address = @address)

-- No need for an EXISTS(), just check the @PK
IF @PK IS NOT NULL 
BEGIN

    IF EXISTS(
        -- List of attributes from table
        -- Possibly very poor performance to get the row by ntext
        SELECT other_attr_1, other_attr_2 ... FROM tableName WHERE PK = @PK
        INTERSECT
        -- List of attributes from variables
        SELECT @other_attr_1, @other_attr_2 ...
    )
    BEGIN
        Insert into tableName (address, child_of, attrs) Values 
        (@address, @PK, @other_attr_1, @other_attr_2 ...)   
    END

END
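
As a minimal, self-contained sketch of what the INTERSECT check evaluates, the batch below uses a table variable with only two attribute columns. The table variable @t, the nvarchar(200) address type and the numeric(18,0) precision are assumptions for illustration, not the real 300-column table. It relies on the fact that INTERSECT treats two NULLs as equal, which a column-by-column comparison with = would not.

```sql
-- Illustrative stand-in for tableName, reduced to two attribute columns.
DECLARE @t TABLE (
    PK           int IDENTITY(1,1) PRIMARY KEY,
    address      nvarchar(200) NOT NULL,
    child_of     int NULL,
    other_attr_1 numeric(18,0) NULL,
    other_attr_2 numeric(18,0) NULL
);

INSERT INTO @t (address, other_attr_1, other_attr_2)
VALUES (N'123 Main St', 1, NULL);

-- Incoming values: identical to the stored row, including the NULL.
DECLARE @address      nvarchar(200) = N'123 Main St',
        @other_attr_1 numeric(18,0) = 1,
        @other_attr_2 numeric(18,0) = NULL;

-- Most recent row for this address (NULL if the address is unknown).
DECLARE @PK int = (SELECT MAX(PK) FROM @t WHERE address = @address);

-- INTERSECT returns a row only when every column matches;
-- NULLs on both sides count as a match.
IF EXISTS (
    SELECT other_attr_1, other_attr_2 FROM @t WHERE PK = @PK
    INTERSECT
    SELECT @other_attr_1, @other_attr_2
)
    PRINT 'incoming values match the stored row';
ELSE
    PRINT 'at least one attribute differs';
```

With these values the batch prints the first message; setting @other_attr_1 to 2 prints the second. A non-empty intersection therefore means "nothing changed", so an insert of a new history row belongs in the branch where the intersection is empty.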

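With that many columns, you might consider hashing all of the columns on insert and storing the result in (yet another) column. In the stored procedure you can apply the same hash to the input parameters and then look for a matching hash, rather than comparing all of those fields one by one.

You will probably need some data conversion to turn your 300ish columns into nvarchar so that they can be concatenated as input to the HASHBYTES function. Also, if any of the columns can be NULL, you have to decide how to treat them. For example, if an existing record has field 216 set to NULL and the row you are trying to add is exactly the same except that field 216 is an empty string, is that a match?

Also, with that many columns, the concatenation may exceed the maximum input size of the HASHBYTES function (8,000 bytes in SQL Server 2014 and earlier), so you may need to break it up into hashes of several smaller chunks.


All that said, does your architecture really require this 300ish-column structure? If you could get away from that, I wouldn't have to get quite so creative here.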
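
The hashing idea could be sketched roughly as follows, again with a two-attribute table variable standing in for the real table. The row_hash column, the '<NULL>' marker, the '|' delimiter and the choice of SHA2_256 are all assumptions made for this example; the marker also answers the NULL-versus-empty-string question by treating the two as different.

```sql
-- Illustrative stand-in for tableName with a hypothetical row_hash column.
DECLARE @t TABLE (
    PK           int IDENTITY(1,1) PRIMARY KEY,
    address      nvarchar(200) NOT NULL,
    child_of     int NULL,
    other_attr_1 numeric(18,0) NULL,
    other_attr_2 numeric(18,0) NULL,
    row_hash     varbinary(32) NOT NULL   -- SHA2_256 produces 32 bytes
);

DECLARE @address      nvarchar(200) = N'123 Main St',
        @other_attr_1 numeric(18,0) = 1,
        @other_attr_2 numeric(18,0) = NULL;

-- Convert each value to nvarchar, map NULL to a marker assumed never to
-- occur as real data, and separate fields with a delimiter so that
-- ('1','23') and ('12','3') do not produce the same input string.
DECLARE @incoming_hash varbinary(32) = HASHBYTES('SHA2_256', CONCAT(
    ISNULL(CONVERT(nvarchar(40), @other_attr_1), N'<NULL>'), N'|',
    ISNULL(CONVERT(nvarchar(40), @other_attr_2), N'<NULL>')
));

-- Most recent row for this address (NULL if the address is new).
DECLARE @PK int = (SELECT MAX(PK) FROM @t WHERE address = @address);

-- Insert when the address is new or the latest row has a different hash.
IF @PK IS NULL
   OR NOT EXISTS (SELECT 1 FROM @t WHERE PK = @PK AND row_hash = @incoming_hash)
BEGIN
    INSERT INTO @t (address, child_of, other_attr_1, other_attr_2, row_hash)
    VALUES (@address, @PK, @other_attr_1, @other_attr_2, @incoming_hash);
END;
```

Equal hashes only make equal inputs overwhelmingly likely, so a cautious implementation might still confirm a match with a full comparison. HASHBYTES with SHA2_256 and CONCAT both require SQL Server 2012 or later.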

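I don't have enough rep to comment, so I'm posting this as an answer.

Eric's SQL should be changed from IF EXISTS to IF NOT EXISTS.

I believe the intended logic is:

  • If there is an existing record for the address, check whether any attribute differs.
  • If any attribute differs, insert a new address record and store the primary key of the latest existing record for that address in the child_of column.

Refactoring Chris's and Eric's SQL: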

    USE [databaseINeed]
    -- SET some_stuff ON --or off :)
    -- ....
    -- GO
    CREATE Procedure [dbo].[insertNonDuplicatedData]
      @address text, @other_attr_1 numeric = NULL, @other_attr_2 numeric = NULL, @other_attr_3 numeric = NULL,....;
    AS
    BEGIN TRY
      -- If the address already exists, lets check for updated data
      IF EXISTS (SELECT 1 FROM tableName WHERE address = @address)  
        BEGIN 
          -- Compare the incoming data with the most recent record for this address
    
            DECLARE @PK int = (SELECT MAX(PK) FROM tableName WHERE address = @address)
            IF NOT EXISTS(
                -- List of attributes from table
                -- Possibly very poor performance to get the row by ntext
                SELECT other_attr_1, other_attr_2 ... FROM tableName WHERE PK = @PK
                INTERSECT
                -- List of attributes from variables
                SELECT @other_attr_1, @other_attr_2 ...
            )
            BEGIN
                -- @simplyink: existing address record has different combination of (297 column) attribute values
                --          at least one attribute column is different (no intersection)
                Insert into tableName (address, child_of, attrs) Values 
                (@address, @PK, @other_attr_1, @other_attr_2 ...)   
            END
    
    
          RETURN
        END       
      -- We don't have any data like this, so lets create a new record altogther
      ELSE
        BEGIN
          -- Every time a SQL statement is executed it returns the number of rows that were affected.  By using "SET NOCOUNT ON" within your stored procedure you can shut off these messages and reduce some of the traffic.
          SET NOCOUNT ON
          INSERT INTO tableName (address, other_attr_1, other_attr_2, other_attr_3, ...)
          VALUES(@address,@other_attr_1,@other_attr_2,@other_attr_3,...)
        END
    END TRY
    BEGIN CATCH
      ...
    END CATCH
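
For reference, here is a cut-down version of the same logic that should run as written, under some loud assumptions: the table dbo.AddressHistoryDemo and the procedure name insertNonDuplicatedDataDemo are hypothetical, only three attribute columns are shown, and the address is declared as nvarchar(200) because a text value cannot be compared with the = operator. GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical table, reduced to three attribute columns for illustration.
CREATE TABLE dbo.AddressHistoryDemo (
    PK           int IDENTITY(1,1) PRIMARY KEY,
    address      nvarchar(200) NOT NULL,
    child_of     int NULL,
    other_attr_1 numeric(18,0) NULL,
    other_attr_2 numeric(18,0) NULL,
    other_attr_3 numeric(18,0) NULL
);
GO

CREATE PROCEDURE dbo.insertNonDuplicatedDataDemo
    @address      nvarchar(200),
    @other_attr_1 numeric(18,0) = NULL,
    @other_attr_2 numeric(18,0) = NULL,
    @other_attr_3 numeric(18,0) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRY
        -- Latest row for this address; NULL when the address is new.
        DECLARE @PK int = (SELECT MAX(PK)
                           FROM dbo.AddressHistoryDemo
                           WHERE address = @address);

        -- Insert when the address is new, or when the latest row and the
        -- incoming values do not intersect (at least one attribute differs).
        IF @PK IS NULL
           OR NOT EXISTS (
                SELECT other_attr_1, other_attr_2, other_attr_3
                FROM dbo.AddressHistoryDemo
                WHERE PK = @PK
                INTERSECT
                SELECT @other_attr_1, @other_attr_2, @other_attr_3
           )
        BEGIN
            -- child_of is NULL for a first row, otherwise the previous PK.
            INSERT INTO dbo.AddressHistoryDemo
                (address, child_of, other_attr_1, other_attr_2, other_attr_3)
            VALUES
                (@address, @PK, @other_attr_1, @other_attr_2, @other_attr_3);
        END;
    END TRY
    BEGIN CATCH
        THROW;  -- re-raise to the caller; requires SQL Server 2012 or later
    END CATCH;
END;
GO

-- Usage: the first call inserts a new row, the second is a no-op because
-- nothing changed, and the third inserts a row whose child_of points at
-- the first row.
EXEC dbo.insertNonDuplicatedDataDemo @address = N'123 Main St', @other_attr_1 = 1;
EXEC dbo.insertNonDuplicatedDataDemo @address = N'123 Main St', @other_attr_1 = 1;
EXEC dbo.insertNonDuplicatedDataDemo @address = N'123 Main St', @other_attr_1 = 2;
```

Folding the "new address" and "changed attributes" cases into one IF avoids a second lookup of the address, since @PK already tells us whether a previous row exists.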
    

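One complication is that the table has 297 columns. I struggle to see why anyone would build a table with that many columns instead of normalizing it, not only for performance but also to simplify data handling. I'd urge you to push back a little on the design. Of course, I'm not privy to your requirements.

I didn't build it! I inherited it. But I'm working on dealing with it, slowly but surely :)

Could you help me a little more? I'm not passing @PK in as a parameter or comparing against it, so the two SELECT statements around the INTERSECT operator won't work as they are. The first SELECT could be SELECT 1 FROM @table_name WHERE address = @address, since I am passing in @address. But I need to get the pk from that row. Should I append AND PK = @PK? Your second SELECT uses @pk; does that come from the first SELECT? Could you add some more inline comments on what this is actually doing and how? Every time I read it I get a little closer; I just need a bit more help.

@chrisFrisina See the edit. You should change the way you get the PK for the insert.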