SQL Server T-SQL: comparing XML values when length > VarChar(MAX)?
Using only SQL Server 2008 R2 (this will be in a stored procedure), how can I determine whether two variables of type XML are equal? Here is what I want to do:
DECLARE @XmlA XML
DECLARE @XmlB XML
SET @XmlA = '[Really long Xml value]'
SET @XmlB = '[Really long Xml value]'
IF @XmlA = @XmlB
SELECT 'Matching Xml!'
But as you probably know, it returns:
Msg 305, Level 16, State 1, Line 7 The XML data type cannot be compared or sorted, except when using the IS NULL operator.
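I could convert to VarChar(MAX) and compare, but that only compares the first 2MB. Is there another way?

You can cast the field to varbinary(max), hash it, and compare the hashes. But if the two XML values are equivalent without being byte-for-byte identical, this will certainly miss the match.
To compute the hash, you can use a CLR function: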
using System;
using System.Data.SqlTypes;
using System.IO;
namespace ClrHelpers
{
public partial class UserDefinedFunctions {
[Microsoft.SqlServer.Server.SqlFunction]
public static Guid HashMD5(SqlBytes data) {
System.Security.Cryptography.MD5CryptoServiceProvider md5 = new System.Security.Cryptography.MD5CryptoServiceProvider();
md5.Initialize();
int len = 0;
byte[] b = new byte[8192];
Stream s = data.Stream;
do {
len = s.Read(b, 0, 8192);
md5.TransformBlock(b, 0, len, b, 0);
} while(len > 0);
md5.TransformFinalBlock(b, 0, 0);
Guid g = new Guid(md5.Hash);
return g;
}
};
}
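The class above still has to be compiled and registered before T-SQL can call it. A minimal sketch of that registration follows. The assembly name `ClrHelpersAssembly` and the DLL path are hypothetical placeholders and do not come from the answer; the permission set may also need adjusting for your environment.

```sql
-- Enable CLR integration (off by default); needs server-level permissions.
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO
-- Hypothetical assembly name and path: point this at the compiled DLL.
CREATE ASSEMBLY ClrHelpersAssembly
FROM 'C:\assemblies\ClrHelpers.dll'
WITH PERMISSION_SET = SAFE;
GO
-- SqlBytes maps to VARBINARY(MAX); Guid maps to UNIQUEIDENTIFIER.
CREATE FUNCTION dbo.HashMD5(@data VARBINARY(MAX))
RETURNS UNIQUEIDENTIFIER
AS EXTERNAL NAME ClrHelpersAssembly.[ClrHelpers.UserDefinedFunctions].HashMD5;
GO
```

Once registered, two XML variables could be compared with `dbo.HashMD5(CAST(@XmlA AS VARBINARY(MAX))) = dbo.HashMD5(CAST(@XmlB AS VARBINARY(MAX)))`.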
Or a SQL function:
CREATE FUNCTION dbo.GetMyLongHash(@data VARBINARY(MAX))
RETURNS VARBINARY(MAX)
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
DECLARE @res VARBINARY(MAX) = 0x
DECLARE @position INT = 1, @len INT = DATALENGTH(@data)
WHILE 1 = 1
BEGIN
SET @res = @res + HASHBYTES('MD5', SUBSTRING(@data, @position, 8000))
SET @position = @position+8000
IF @Position > @len
BREAK
END
WHILE DATALENGTH(@res) > 16 SET @res= dbo.GetMyLongHash(@res)
RETURN @res
END
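As a usage sketch, the hash function above could be plugged into the comparison from the question as shown here. The sample XML values are illustrative assumptions; `dbo.GetMyLongHash` is the function defined in this answer.

```sql
-- Illustrative sample values; replace with the real XML variables.
DECLARE @XmlA XML = N'<root><item id="1">a</item></root>';
DECLARE @XmlB XML = N'<root><item id="1">a</item></root>';

-- XML cannot be compared directly, but its VARBINARY(MAX) form can be hashed.
IF dbo.GetMyLongHash(CAST(@XmlA AS VARBINARY(MAX)))
 = dbo.GetMyLongHash(CAST(@XmlB AS VARBINARY(MAX)))
    SELECT 'Matching Xml!';
ELSE
    SELECT 'Different Xml';
```

This only detects values whose stored binary form is identical. Two documents that differ in, say, attribute order may hash differently even though they are logically equivalent.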
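I stumbled upon a more detailed approach that actually compares the contents of the two XML values to determine whether they are the same. That makes sense, because the order of attributes in a node can differ even when their values are exactly the same. I suggest you read through it, and perhaps implement the function, to see whether it suits you. I tried it out quickly and it seemed to work for me.

There are many different ways to compare two XML documents, and a lot depends on which differences you want to tolerate. You certainly need to tolerate differences in encoding, attribute order, insignificant whitespace, numeric character references, and the choice of attribute delimiters, and you should probably also tolerate differences in comments, namespace prefixes, and CDATA usage. So comparing two XML documents as strings is definitely not a good idea, unless you apply XML canonicalization first.
For many purposes the XQuery deep-equal() function does the right thing (it is more or less equivalent to comparing the canonical forms of the two XML documents). I don't know enough about Microsoft's SQL Server XQuery implementation to tell you how to invoke it from the SQL level.

If you can use SQL CLR, I suggest writing the function there. If you cannot, you can write your own function: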
CREATE function [dbo].[udf_XML_Is_Equal]
(
    @Data1 xml,
    @Data2 xml
)
returns bit
as
begin
    declare
        @i bigint, @cnt1 bigint, @cnt2 bigint,
        @Sub_Data1 xml, @Sub_Data2 xml,
        @Name varchar(max), @Value1 nvarchar(max), @Value2 nvarchar(max)
    if @Data1 is null or @Data2 is null
        return 1
    --=========================================================================================================
    -- If there is more than one root element, recurse for each element
    --=========================================================================================================
    select
        @cnt1 = @Data1.query('count(/*)').value('.','int'),
        @cnt2 = @Data2.query('count(/*)').value('.','int')
    if @cnt1 <> @cnt2
        return 0
    if @cnt1 > 1
    begin
        select @i = 1
        while @i ...

Check this SQL function:
CREATE FUNCTION [dbo].[CompareXml]
(
@xml1 XML,
@xml2 XML
)
RETURNS INT
AS
BEGIN
DECLARE @ret INT
SELECT @ret = 0
-- -------------------------------------------------------------
-- If one of the arguments is NULL then we assume that they are
-- not equal.
-- -------------------------------------------------------------
IF @xml1 IS NULL OR @xml2 IS NULL
BEGIN
RETURN 1
END
-- -------------------------------------------------------------
-- Match the name of the elements
-- -------------------------------------------------------------
IF (SELECT @xml1.value('(local-name((/*)[1]))','VARCHAR(MAX)'))
<>
(SELECT @xml2.value('(local-name((/*)[1]))','VARCHAR(MAX)'))
BEGIN
RETURN 1
END
---------------------------------------------------------------
--Match the value of the elements
---------------------------------------------------------------
IF((@xml1.query('count(/*)').value('.','INT') = 1) AND (@xml2.query('count(/*)').value('.','INT') = 1))
BEGIN
DECLARE @elValue1 VARCHAR(MAX), @elValue2 VARCHAR(MAX)
SELECT
@elValue1 = @xml1.value('((/*)[1])','VARCHAR(MAX)'),
@elValue2 = @xml2.value('((/*)[1])','VARCHAR(MAX)')
IF @elValue1 <> @elValue2
BEGIN
RETURN 1
END
END
-- -------------------------------------------------------------
-- Match the number of attributes
-- -------------------------------------------------------------
DECLARE @attCnt1 INT, @attCnt2 INT
SELECT
@attCnt1 = @xml1.query('count(/*/@*)').value('.','INT'),
@attCnt2 = @xml2.query('count(/*/@*)').value('.','INT')
IF @attCnt1 <> @attCnt2 BEGIN
RETURN 1
END
-- -------------------------------------------------------------
-- Match the names and values of attributes
-- Here we need to run a loop over each attribute in the
-- first XML element and see if the same attribute exists
-- in the second element. If the attribute exists, we
-- need to check if the value is the same.
-- -------------------------------------------------------------
DECLARE @cnt INT, @cnt2 INT
DECLARE @attName VARCHAR(MAX)
DECLARE @attValue VARCHAR(MAX)
SELECT @cnt = 1
WHILE @cnt <= @attCnt1
BEGIN
SELECT @attName = NULL, @attValue = NULL
SELECT
@attName = @xml1.value(
'local-name((/*/@*[sql:variable("@cnt")])[1])',
'varchar(MAX)'),
@attValue = @xml1.value(
'(/*/@*[sql:variable("@cnt")])[1]',
'varchar(MAX)')
-- check if the attribute exists in the other XML document
IF @xml2.exist(
'(/*/@*[local-name()=sql:variable("@attName")])[1]'
) = 0
BEGIN
RETURN 1
END
IF @xml2.value(
'(/*/@*[local-name()=sql:variable("@attName")])[1]',
'varchar(MAX)')
<>
@attValue
BEGIN
RETURN 1
END
SELECT @cnt = @cnt + 1
END
-- -------------------------------------------------------------
-- Match the number of child elements
-- -------------------------------------------------------------
DECLARE @elCnt1 INT, @elCnt2 INT
SELECT
@elCnt1 = @xml1.query('count(/*/*)').value('.','INT'),
@elCnt2 = @xml2.query('count(/*/*)').value('.','INT')
IF @elCnt1 <> @elCnt2
BEGIN
RETURN 1
END
-- -------------------------------------------------------------
-- Start recursion for each child element
-- -------------------------------------------------------------
SELECT @cnt = 1
SELECT @cnt2 = 1
DECLARE @x1 XML, @x2 XML
DECLARE @noMatch INT
WHILE @cnt <= @elCnt1
BEGIN
SELECT @x1 = @xml1.query('/*/*[sql:variable("@cnt")]')
--RETURN CONVERT(VARCHAR(MAX),@x1)
WHILE @cnt2 <= @elCnt2
BEGIN
SELECT @x2 = @xml2.query('/*/*[sql:variable("@cnt2")]')
SELECT @noMatch = dbo.CompareXml( @x1, @x2 )
IF @noMatch = 0 BREAK
SELECT @cnt2 = @cnt2 + 1
END
SELECT @cnt2 = 1
IF @noMatch = 1
BEGIN
RETURN 1
END
SELECT @cnt = @cnt + 1
END
RETURN @ret
END
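A short usage sketch for the function above. The sample XML literals are illustrative assumptions. Going by the code and the discussion below, a result of 0 should mean "match" and 1 should mean "no match".

```sql
-- String literals are implicitly converted to the XML parameters.
SELECT
    -- Same element and attribute values, attributes written in a different order.
    dbo.CompareXml(N'<a x="1" y="2">v</a>', N'<a y="2" x="1">v</a>') AS AttrOrderSwapped,
    -- Same attribute name, different attribute value.
    dbo.CompareXml(N'<a x="1">v</a>', N'<a x="2">v</a>') AS AttrValueDiffers;
```

Because the function inspects only the first root element, it is worth testing it separately against XML fragments with several top-level elements before relying on it.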
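Do you mean the first 2GB? Is your XML really larger than 2MB? Do you really need to know whether it is identical? If so, it may be better to do the comparison outside SQL Server. Would the comparison you want treat ... as equivalent to ...?

VARCHAR(MAX) can store up to 2GB of data (150 times the size of Tolstoy's War and Peace), and your XML is bigger than that!?

You cannot store XML larger than 2GB that would match varchar(max). Strictly speaking, though, if the XML is larger than 2MB (or worse, larger than 2GB), it should be varbinary(max).

I think there are many subtle differences in the source XML that could lead to the same hash value (a collision), falsely indicating that the values are equal when they are not. I know you mention this at the start of your answer, but I think it could be stated more explicitly.

@AaronBertrand First, you cannot store more than 2GB in a SQL Server xml field. Second, with a proper hash algorithm (i.e. md5, sha1) you are guaranteed no collisions.

Sorry, I meant close to 2GB, not more than 2GB. I'm not so sure you can guarantee no collisions; a few days ago John Huang seemed to demonstrate an example. His collision is not XML-based, but it is still an example that violates your "guarantee"...

@AaronBertrand For MD5 it is possible but negligible; for SHA1, still impossible.

I think the issue here is that one of the XML values could contain extra whitespace, or attributes could be switched around, in which case I think the hash comparison would fail.

Note: this will fail for XML fragments, e.g. when there is no single root element, as in SELECT dbo.CompareXml(...). It seems to compare only the first node.

Please don't post code you don't understand. It does not return a boolean/bit; it returns an integer. From a cursory look at the code, a return value of 0 means "it matches" and a value of 1 means "it does not match". Please run SELECT dbo.CompareXml(...), dbo.CompareXml(...), dbo.CompareXml(...), dbo.CompareXml(...) and check the results. Only the third returns 1, meaning "different".

I can adopt and change code just fine; I have been doing it for a long time.

Let's get some facts straight. You said "check this code"; I did, and I warned and commented on when it would fail. You then dismissed my comment, said I was wrong, and gave an incorrect description of how the code works. I then wrote a concrete test to demonstrate its flaw. You then deleted the incorrect comments and went on to suggest that I change my test to hide the underlying problem! I suggest you stay within your abilities.

@布雷曼兹, I agree. Note that the code was posted several years ago, when I was trying to find a solution to a problem I had. I don't remember exactly how this code works. I will add this to the answer. I am removing comments that are outdated or just spam.

The link is no longer available.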