Comparing values delimited by a special character in SQL Server
Tags: sql, sql-server-2012
I have two columns, say COLA and COLB, with the following data:
COLA              | COLB
------------------+------------------
PLATE|SPOON|GLASS | PLATE|GLASS|SPOON
PLATE             | SPOON
OIL|JUG|MAT       | JUG|MAT
SPOON             | SPOON
OIL|MAT           | MAT|OIL
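I am trying to return the rows whose two columns do not contain the same elements, regardless of the order of the elements.
Expected output:
COLA        | COLB
------------+--------
PLATE       | SPOON
OIL|JUG|MAT | JUG|MAT
I tried the approach below and many others, but none of them worked. I don't know SQL very well: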
SELECT *
FROM MYTABLE
WHERE COLA NOT LIKE '%COLB%'
One method is a recursive subquery (a recursive CTE):
with cte as (
select convert(varchar(max), null) as parta,
convert(varchar(max), cola) as resta,
cola, colb,
row_number() over (order by (select null)) as seqnum
from t
union all
select convert(varchar(max),
left(resta, charindex('|', resta + '|') - 1)
) as parta,
convert(varchar(max),
stuff(resta, 1, charindex('|', resta + '|'), '')
) as resta,
cola, colb, seqnum
from cte
where resta <> ''
)
select cola, colb
from cte
where parta is not null
group by seqnum, cola, colb
having sum(case when concat('|', colb, '|') like concat('%|', parta, '|%') then 1 else 0 end) <> count(*) or
len(cola) <> len(colb);
Here is a fiddle.
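To make the recursive step above easier to follow, here is a minimal sketch of a single iteration applied to one literal value. The variable @resta is hypothetical and only mirrors the resta column of the CTE; it is not part of the original answer.

```sql
-- Hypothetical variable standing in for the CTE's resta column.
DECLARE @resta varchar(max) = 'OIL|JUG|MAT';

SELECT
    -- Appending '|' guarantees CHARINDEX finds a delimiter even for the last element.
    LEFT(@resta, CHARINDEX('|', @resta + '|') - 1)     AS parta,  -- 'OIL'
    -- STUFF removes the first element together with its trailing delimiter.
    STUFF(@resta, 1, CHARINDEX('|', @resta + '|'), '') AS resta;  -- 'JUG|MAT'
```

Each recursion level repeats this on the remaining string until nothing is left, which is what the `where resta <> ''` condition checks.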
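This would be much easier in newer versions of SQL Server, which support string splitting and aggregation. You can use a user-defined function to split the delimited string in each column, and then compare the results of that function.
Before 2016, my go-to was Jeff Moden's DelimitedSplit8K, one of the fastest string-splitting functions for SQL Server from the time before string splitting was built in. You can read all about it in the article he wrote.
First, create and populate a sample table. Please save us this step in your future questions: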
DECLARE @T AS TABLE (
ColA varchar(100),
ColB varchar(100)
);
INSERT INTO @T (ColA, ColB) VALUES
('PLATE|SPOON|GLASS', 'PLATE|GLASS|SPOON'),
('PLATE', 'SPOON'),
('OIL|JUG|MAT', 'JUG|MAT'),
('SPOON', 'SPOON'),
('OIL|MAT', 'MAT|OIL');
The query:
SELECT ColA, ColB
FROM @T
WHERE EXISTS (
SELECT Item FROM [dbo].[DelimitedSplit8K](ColA, '|')
EXCEPT
SELECT Item FROM [dbo].[DelimitedSplit8K](ColB, '|')
)
OR
EXISTS (
SELECT Item FROM [dbo].[DelimitedSplit8K](ColB, '|')
EXCEPT
SELECT Item FROM [dbo].[DelimitedSplit8K](ColA, '|')
)
Results:
ColA         ColB
PLATE        SPOON
OIL|JUG|MAT  JUG|MAT
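The query checks EXCEPT in both directions because EXCEPT is not symmetric. A small sketch with inline VALUES lists shows why; the aliases a, b and v are made up for illustration, and no split function is needed to run it:

```sql
-- Elements of ColA minus elements of ColB: one row, 'OIL'.
SELECT v FROM (VALUES ('OIL'), ('JUG'), ('MAT')) AS a(v)
EXCEPT
SELECT v FROM (VALUES ('JUG'), ('MAT')) AS b(v);

-- The reverse direction returns no rows, so testing only this side
-- would miss the mismatch.
SELECT v FROM (VALUES ('JUG'), ('MAT')) AS b(v)
EXCEPT
SELECT v FROM (VALUES ('OIL'), ('JUG'), ('MAT')) AS a(v);
```

Note that EXCEPT works on distinct values, so under this approach a column holding 'A|A' would be treated as equal to one holding 'A'.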
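Here is an approach that relies on XML functions to split the pipe-separated strings into rows. It then compares the values from cola and colb and reports the differences: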
with data2
as (select row_number() over(order by (select null)) as rnk ,cola,colb
from t
)
,combo_data
as(
SELECT a.rnk
,a.cola
,a.colb
,Split.a.value('.', 'NVARCHAR(max)') AS Data
,1 as a_flag
,null as b_flag
FROM ( SELECT rnk
,cola
,colb
,CAST ('<M>' + REPLACE(cola, '|', '</M><M>') + '</M>' AS XML) AS Data
FROM data2
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
union all
SELECT a.rnk
,a.cola
,a.colb
,Split.a.value('.', 'NVARCHAR(max)') AS Data
,null as a_flag
,1 as b_flag
FROM ( SELECT rnk
,cola
,colb
,CAST ('<M>' + REPLACE(colb, '|', '</M><M>') + '</M>' AS XML) AS Data
FROM data2
) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
)
select rnk,cola,colb,data,count(a_flag) as present_in_cola,count(b_flag) as present_in_colb
from combo_data
group by rnk,cola,colb,data
having count(a_flag) <> count(b_flag)
order by 1,2,3,4
+-----+-------------+---------+-------+-----------------+-----------------+
| rnk | cola        | colb    | data  | present_in_cola | present_in_colb |
+-----+-------------+---------+-------+-----------------+-----------------+
|   2 | PLATE       | SPOON   | PLATE |               1 |               0 |
|   2 | PLATE       | SPOON   | SPOON |               0 |               1 |
|   3 | OIL|JUG|MAT | JUG|MAT | OIL   |               1 |               0 |
+-----+-------------+---------+-------+-----------------+-----------------+
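The splitting trick used in both branches of the UNION ALL can be tried in isolation. The sketch below uses hypothetical variables @s and @x, and it assumes the values contain no XML special characters such as & or <, which would make the CAST fail.

```sql
DECLARE @s varchar(100) = 'OIL|JUG|MAT';

-- Turn 'OIL|JUG|MAT' into '<M>OIL</M><M>JUG</M><M>MAT</M>'.
DECLARE @x xml = CAST('<M>' + REPLACE(@s, '|', '</M><M>') + '</M>' AS xml);

-- nodes() yields one row per <M> element; value() extracts its text.
-- This returns three rows: OIL, JUG and MAT (row order is not guaranteed).
SELECT m.n.value('.', 'varchar(100)') AS item
FROM @x.nodes('/M') AS m(n);
```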
Comments:
You mentioned that you don't have much background in SQL. As it turns out, your current data model (storing pipe-separated values in each cell) is undesirable for many reasons. Ideally, each value should sit on its own row/record. I hesitate to even answer your current question, because the solution would be very ugly.
@TimBiegeleisen, unfortunately I don't have permission to change the data model. I will escalate this suggestion. But for now it would be great if you could give me some way to achieve this.
Does your table have some kind of ID column?
Without a string-splitting function this is very difficult; with one, it is much easier.
Great! Thanks a lot @Gordan Linoff for the solution. We plan to upgrade SQL Server soon. Hopefully we can get it done quickly.
Wow, this looks simple. Thank you very much. I have some trouble understanding the XML and nodes part, but it solved my problem. I will read up on it.
If you upgrade, you can use the extra solution.
The extra solution looks even simpler than the previous one. Thanks for that too. Seeing this, I gather there is a big difference between the 2012 and 2016 versions. We will definitely plan the upgrade soon.
@Omega They already have a 2019 version. But skip 2016. The 2017 version has some useful things, f.e.
Thanks for your effort. However, it gives me more rows than expected.
Yes, it gives you the differences, so it returns the cola values that are not present in colb and vice versa. If you don't want that, you can remove the data column from the GROUP BY and SELECT clauses.
Thanks a lot @Zohar Peled. Right now I don't have permission to create this function. I will request access and try this; I believe it will be the easiest one.
Use CROSS APPLY to convert both strings to the XML type.
The node values of those XMLs can then be compared in an EXISTS clause.
Sample data:
CREATE TABLE YourTable
(
ID INT IDENTITY(1,1) PRIMARY KEY,
ColA NVARCHAR(100),
ColB NVARCHAR(100)
);
INSERT INTO YourTable (ColA, ColB) VALUES
('PLATE|SPOON|GLASS', 'PLATE|GLASS|SPOON')
, ('PLATE', 'SPOON')
, ('OIL|JUG|MAT', 'JUG|MAT')
, ('SPOON', 'SPOON')
, ('OIL|MAT', 'MAT|OIL');
GO
Query:
SELECT t.*
FROM YourTable t
CROSS APPLY
(
SELECT
CAST('<a>'+REPLACE(t.ColA,'|','</a><a>')+'</a>' AS XML) AS XmlA,
CAST('<b>'+REPLACE(t.ColB,'|','</b><b>')+'</b>' AS XML) AS XmlB
) caX
WHERE EXISTS
(
SELECT 1
FROM
(
(
SELECT a.val.value('.','nvarchar(100)') AS val
FROM caX.XmlA.nodes('/a') AS a(val)
EXCEPT
SELECT b.val.value('.','nvarchar(100)') AS val
FROM caX.XmlB.nodes('/b') AS b(val)
)
UNION ALL
(
SELECT b.val.value('.','nvarchar(100)') AS val
FROM caX.XmlB.nodes('/b') AS b(val)
EXCEPT
SELECT a.val.value('.','nvarchar(100)') AS val
FROM caX.XmlA.nodes('/a') AS a(val)
)
) q
);
Result:
ID | ColA        | ColB
---+-------------+--------
 2 | PLATE       | SPOON
 3 | OIL|JUG|MAT | JUG|MAT
Extra solution for SQL Server 2016 and later, using the built-in STRING_SPLIT function:
SELECT t.*
FROM YourTable t
WHERE EXISTS
(
SELECT 1
FROM STRING_SPLIT(ColA,'|') a
FULL JOIN STRING_SPLIT(ColB,'|') b
ON a.value = b.value
WHERE a.value IS NULL
OR b.value IS NULL
);
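STRING_SPLIT is only available when the database compatibility level is 130 or higher, so it may be worth checking that before relying on the query above. A minimal sketch, using a literal string rather than the sample table:

```sql
-- STRING_SPLIT requires compatibility level 130 (SQL Server 2016) or higher.
SELECT name, compatibility_level
FROM sys.databases
WHERE name = DB_NAME();

-- Returns a single column named value, one row per element.
SELECT value
FROM STRING_SPLIT('OIL|JUG|MAT', '|');
```

Because the FULL JOIN matches on the element values alone, repeated elements within one column do not produce a mismatch; 'A|A' and 'A' would be considered equal here as well.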