SQL Server: finding duplicate substrings in a column

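I have a clients table in SQL Server. I am trying to find duplicates in the email address column, but I only need to consider part of the column data, that is, a substring. In effect, I need to find the domain names that are repeated across records.

I have used the query below to find exact duplicates (on the whole field), but how can I modify it to consider a substring?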

SELECT a.email_address, b.dupeCount, a.client_id
FROM tblClient a
INNER JOIN (
    SELECT email_address, COUNT(*) AS dupeCount
    FROM tblClient
    GROUP BY email_address
    HAVING COUNT(*) > 1
) b ON a.email_address = b.email_address
Many thanks.

gee:

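-- SQL Server uses SUBSTRING (not substr) and cannot GROUP BY a column position.
SELECT SUBSTRING(email_address, 1, 2) AS email_prefix, COUNT(*) AS dupeCount
FROM tblClient
GROUP BY SUBSTRING(email_address, 1, 2)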
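
The asker's original join can be adapted the same way, grouping on the domain instead of the whole address. The following is a minimal sketch, not a definitive answer: it assumes each email_address contains an '@' (if none is present, the whole address is treated as the domain), and the table variable @tblClient with its sample rows is a hypothetical stand-in for tblClient.

```sql
-- Hypothetical stand-in for tblClient; the sample rows are for illustration only.
DECLARE @tblClient TABLE (
    client_id     INT IDENTITY(1, 1),
    email_address NVARCHAR(320)
);

INSERT INTO @tblClient (email_address)
VALUES (N'joe@billy_bobs.com'),
       (N'sally@beauty.com'),
       (N'george@billy_bobs.com');

-- Compute the domain once per row, then join the per-domain counts back to the rows.
WITH client_domain AS (
    SELECT client_id,
           email_address,
           -- Everything after the first '@' in the address.
           SUBSTRING(email_address,
                     CHARINDEX(N'@', email_address) + 1,
                     LEN(email_address)) AS domain_name
    FROM @tblClient
)
SELECT a.email_address, b.dupeCount, a.client_id
FROM client_domain a
INNER JOIN (
    SELECT domain_name, COUNT(*) AS dupeCount
    FROM client_domain
    GROUP BY domain_name
    HAVING COUNT(*) > 1
) b ON a.domain_name = b.domain_name;
```

With these sample rows, only the two billy_bobs.com addresses should come back, each with a dupeCount of 2. Against the real table, replace @tblClient with tblClient and drop the DECLARE and INSERT statements.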

Try this:

declare @contact table (
  [client_id] [int] identity(1, 1)
  , [email]   [sysname]
  );
insert into @contact
        ([email])
values      (N'joe@billy_bobs.com'),
        (N'sally@beauty.com'),
        (N'george@billy_bobs.com');
with [stripper]
 as (select [client_id]
            , [email]
            , substring([email]
                        , charindex(N'@', [email], 0) + 1
                        , len([email])) as [domain_name]
     from   @contact),
 [duplicate_finder]
 as (select [client_id]
            , [domain_name]
            , row_number()
                over (
                  partition by [domain_name]
                  order by [domain_name]) as [sequence]
     from   [stripper])
select [client_id], [domain_name]
from   [duplicate_finder]
where  [sequence] > 1;

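If you already suspect you need a substring, you might as well try it. A pivot might work better for the data you want to get.

Try joining on the matching substring of the email address.

Thanks for the reply. I don't actually want to delete the duplicate records. How can I make this work without the delete part?

Adam, I've updated it to just a select statement to reflect your question.

How would you modify this query to get all the rows associated with those unique substrings?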
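
One possible way to return every row that shares a domain with another row, rather than only the second and later occurrences, is to count rows per domain with a window function and filter on that count. This is a sketch under assumptions: it reuses the @contact sample data from the answer above, the CTE name [domain_counter] and the column [domain_count] are made up for illustration, and addresses are assumed to contain an '@'.

```sql
-- Sample data copied from the answer above; for illustration only.
DECLARE @contact TABLE (
    client_id INT IDENTITY(1, 1),
    email     NVARCHAR(320)
);

INSERT INTO @contact (email)
VALUES (N'joe@billy_bobs.com'),
       (N'sally@beauty.com'),
       (N'george@billy_bobs.com');

WITH stripper AS (
    SELECT client_id,
           email,
           -- Domain is everything after the first '@'.
           SUBSTRING(email, CHARINDEX(N'@', email) + 1, LEN(email)) AS domain_name
    FROM @contact
),
domain_counter AS (  -- hypothetical name
    SELECT client_id,
           email,
           domain_name,
           -- Number of rows sharing this domain, repeated on every row of the partition.
           COUNT(*) OVER (PARTITION BY domain_name) AS domain_count
    FROM stripper
)
SELECT client_id, email, domain_name, domain_count
FROM domain_counter
WHERE domain_count > 1
ORDER BY domain_name, client_id;
```

Unlike the row_number() filter, which skips the first row in each group, this keeps all rows of a repeated domain, so both billy_bobs.com contacts should appear for the sample data.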