
SQL string manipulation: find all permutations

Tags: sql, sql-server, string, tsql, parsing

So I have a set of strings in a column, in the format '/A/B/C/D/E'.

A, B, C, D and E represent strings of varying lengths. How can I go from the string above to a set of strings that always keeps the same descending, chronological order? By this I mean that A can only be followed by B, and B can only be followed by C and must be preceded by A. The results should look like this:

Results:
'/A'
'/A/B'
'/A/B/C'
'/A/B/C/D'
'/A/B/C/D/E'

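with x as (
      select '/A/B/C/D/E' as col
     ),
     cte as (
      -- anchor: start with the full string
      select col
      from x
      union all
      -- recursive step: cut off everything from the last '/' onward
      select left(col, len(col) - charindex('/', reverse(col))) as col
      from cte
      where col like '/%/%'
     )
select *
from cte;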
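
To see what each recursive step does, the expression from the recursive member can be run on its own against a single value. This is only an illustrative sketch: the variable @col and the column aliases are made up for the example and are not part of the answer above.

```sql
-- Hypothetical one-off check of the trimming expression used in the recursive CTE
declare @col varchar(100) = '/A/B/C/D/E';

select reversed       = reverse(@col),                    -- 'E/D/C/B/A/'
       slashFromRight = charindex('/', reverse(@col)),    -- 2: the last '/' sits 2 characters from the end
       trimmed        = left(@col, len(@col) - charindex('/', reverse(@col)));  -- '/A/B/C/D'
```

Because the position of the last slash is found by searching the reversed string, the expression does not depend on how long each item is. The recursion stops once the string no longer matches '/%/%', that is, once only one item is left.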



declare @string varchar(100) = '/A/B/C/D/E';

-- Assumes every item is exactly one character long, so each prefix ends at position 2, 4, 6, ...
with iTally(n) as 
( select top (len(@string)/2) (row_number() over (order by (select null))-1)*2+2
  from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),
       (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x)) -- up to 100 characters
select txt = substring(@string,1,n)
from iTally;

txt
-----------
/A
/A/B
/A/B/C
/A/B/C/D
/A/B/C/D/E
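
The question says the items can have different lengths, while the tally query above relies on every item being a single character. A minimal sketch of how the same tally idea might be adapted is shown below. It assumes the string starts with '/', has no trailing '/', and is at most 100 characters long; the sample value '/alpha/be/gamma' is made up for illustration.

```sql
-- Hypothetical sample value with items of different lengths
declare @string varchar(100) = '/alpha/be/gamma';

with iTally(n) as
( -- one row per character position: 1, 2, ..., len(@string)
  select top (len(@string)) row_number() over (order by (select null))
  from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),
       (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x)) -- up to 100 characters
select txt = substring(@string, 1, n)
from iTally
where n = len(@string)                     -- the full string is the last prefix
   or substring(@string, n + 1, 1) = '/'   -- a prefix ends right before the next '/'
order by n;
-- returns '/alpha', '/alpha/be', '/alpha/be/gamma'
```

Instead of stepping through fixed even positions, this version keeps only the positions that are immediately followed by a delimiter, so item length no longer matters.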






declare @string varchar(100) = '/B/D/A/E/C';

-- Same idea, but the one-character items are first put in alphabetical order
with iTally(n) as 
( select top (len(@string)/2) (row_number() over (order by (select null))-1)*2+2
  from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),
       (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x)) -- up to 100 characters
select txt = substring(reOrder.newString, 1, n)
from iTally
cross apply
( -- rebuild the string with its items sorted, e.g. '/A/B/C/D/E'
  select '/'+substring(@string,n,1)
  from iTally
  order by substring(@string,n,1)
  for xml path('')
) reOrder(newString);








-- Query: keep only the even-length left edge-grams (assumes one-character items)
select itemNumber = tokenlen/2, leftToken
from dbo.edgeNgrams8k('/A/B/C/D/E')
where tokenlen % 2 = 0;

-- The function
if object_id('dbo.edgeNgrams8k', 'IF') is not null drop function dbo.edgeNgrams8k;
go
create function dbo.edgeNgrams8k(@string varchar(8000))
/*
Purpose:
  edgeNgrams8k is an inline table valued function (itvf) that accepts a varchar(8000) 
  input string (@string) and returns a series of character-level left and right edge 
  n-grams. An edge n-gram (referred to herein as an "edge-gram" for brevity) is a type of 
  n-gram (see https://en.wikipedia.org/wiki/N-gram). Instead of a contiguous series of 
  n-sized tokens (n-grams), however, an edge n-gram is a series of tokens that begins 
  with the input string's first character and then grows by one character, the next in the
  string, until the token is as long as the input string. 

  Left edge-grams start at the beginning of the string and grow from left-to-right. Right
  edge-grams begin at the end of the string and grow from right-to-left. Note this query
  and the result-set:

  select * from dbo.edgeNgrams8k('ABC');

  tokenlen   leftToken    rightTokenIndex  righttoken
  ---------- ------------ ---------------- ----------
  1          A            3                C
  2          AB           2                BC
  3          ABC          1                ABC

Developer Notes:
 1. For more about N-Grams in SQL Server see: http://www.sqlservercentral.com/articles/Tally+Table/142316/
    For more about Edge N-Grams see the documentation by Elastic here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html 

 2. dbo.edgeNgrams8k is deterministic. For more about determinism see: https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/deterministic-and-nondeterministic-functions

 3. If you need to sort this data without getting a sort in your execution plan you can 
    sort by tokenLen for ascending order, or by rightTokenIndex for descending order.

Usage Examples:
  Turn /A/B/C/D/E into its leading prefixes (/A, /A/B, /A/B/C, ...):

  select leftToken 
    from dbo.edgeNgrams8k('/A/B/C/D/E')
  where tokenLen % 2 = 0

History:
 20171125 - Initial Development - Developed by Alan Burstein  
*/
returns table with schemabinding as return
with iTally(n) as 
(
  select top (len(@string)) row_number() over (order by (select $))
  from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x), -- 10^1 = 10
       (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x), -- 10^2 = 100
       (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) c(x), -- 10^3 = 1000
       (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) d(x)  -- 10^4 = 10000
)
select top (convert(bigint, len(@string), 0))
  tokenlen        = n,
  leftToken       = substring(@string,1,n),
  rightTokenIndex = len(@string)+1-n,
  righttoken      = substring(@string,len(@string)+1-n, n)
from iTally;
go

itemNumber           leftToken
-------------------- -----------
1                    /A
2                    /A/B
3                    /A/B/C
4                    /A/B/C/D
5                    /A/B/C/D/E




-- Gordon's logic as an inline table valued function
create function dbo.rCTE_GL (@string varchar(8000))
returns table as return
with x as (select @string as col),
     cte as (
      select col
      from x
      union all
      select left(col, len(col) - charindex('/', reverse(col))) as col
      from cte
      where col like '/%/%'
     )
select *
from cte;
go

-- My logic as a table valued function
create function dbo.tally_AB(@string varchar(8000))
returns table as return    
with iTally(n) as 
( select top (len(@string)/2) (row_number() over (order by (select null))-1)*2+2
  from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),
       (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x)) -- up to 100 characters
select txt = substring(reOrder.newString, 1, n)
from iTally
cross apply
(
  select '/'+substring(@string,n,1)
  from iTally
  order by substring(@string,n,1)
  for xml path('')
) reOrder(newString);
go



Gordon (unsorted)
Table 'Worktable'. Scan count 100001, logical reads 3492199, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#strings____________________________________________________________________________________________________________00000000004C'. Scan count 1, logical reads 346, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 4625 ms,  elapsed time = 4721 ms.

Alan (sorted) - serial
Table 'Worktable'. Scan count 20979, logical reads 563853, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#strings____________________________________________________________________________________________________________00000000004C'. Scan count 1, logical reads 346, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 1782 ms,  elapsed time = 1790 ms.

Alan (sorted) - parallel
Table '#strings____________________________________________________________________________________________________________00000000004C'. Scan count 9, logical reads 346, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 20979, logical reads 563860, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 3762 ms,  elapsed time = 992 ms.

Alan (unsorted) - serial
Table '#strings____________________________________________________________________________________________________________00000000004C'. Scan count 9, logical reads 346, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 219 ms,  elapsed time = 217 ms.

Alan (unsorted) - parallel
Table '#strings____________________________________________________________________________________________________________00000000004C'. Scan count 9, logical reads 346, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 SQL Server Execution Times:
   CPU time = 393 ms,  elapsed time = 101 ms.

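There you have it. Depending on your needs and the number of CPUs used, the tally table solution is 4-40 times faster than the recursive CTE, and it does only a fraction of the reads.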

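Comments:

Tag your question with the database you are using.

@GordonLinoff Done

To elaborate on Dr. Linoff's request: it is helpful to tag database questions with the appropriate software (MySQL, Oracle, DB2, etc.) and version, e.g.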