A "merge field"-like function in SQL Server

I am trying to find a way to have the DBMS perform merge-field substitution in a long text.

Create the structure:

CREATE TABLE [dbo].[store]
(
    [id] [int] NOT NULL,
    [text] [nvarchar](MAX) NOT NULL
)

CREATE TABLE [dbo].[statement]
(
    [id] [int] NOT NULL,
    [store_id] [int] NOT NULL
)

CREATE TABLE [dbo].[statement_merges]
(
    [statement_id] [int] NOT NULL,
    [merge_field] [nvarchar](30) NOT NULL,
    [user_data] [nvarchar](MAX) NOT NULL
)
Now, create the test values:

INSERT INTO [store] (id, text) 
VALUES (1, 'Waw, stackoverflow is an amazing library of lost people in the IT hell, and i have the feeling that $$PERC_SAT$$ of the users found a solution, personally I asked $$ASKED$$ questions.')

INSERT INTO [statement] (id, store_id) 
VALUES (1, 1)

INSERT INTO [statement_merges] (statement_id, merge_field, user_data) 
VALUES (1, '$$PERC_SAT$$', '85%')

INSERT INTO [statement_merges] (statement_id, merge_field, user_data) 
VALUES (1, '$$ASKED$$', '12')
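Currently, my application produces the final statement itself: it loops over the merges, replaces each one in the stored text, and outputs:

Waw, stackoverflow is an amazing library of lost people in the IT hell, and i have the feeling that 85% of the users found a solution, personally I asked 12 questions.

I am looking for a way to do this independently of the application code and get the output in a single query: select a statement whose stored text has already been filled in with the user data. I hope that is clear.

I looked at the TRANSLATE function, but it appears to do character-by-character replacement, so I see two options:

- I write a recursive function that replaces the merge fields one by one until no merge_field is left in the computed text; but I have doubts about the performance of this approach.
- There is some magic way to do it, but I need your knowledge...

I want this because the real text is very long and I do not want to store it several times in my database. Imagine a 3-page contract with only 12 parameters, such as the start date, the invoice amount, etc. Everything else must remain unchanged for regulatory compliance.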
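
To illustrate why TRANSLATE does not fit here, a minimal sketch (the literal strings are only examples; TRANSLATE requires SQL Server 2017 or later): TRANSLATE maps single characters one-to-one, while REPLACE substitutes a whole substring, which is what a merge field needs.

```sql
-- TRANSLATE: each character of the 2nd argument is mapped to the character
-- at the same position in the 3rd argument (both must have the same length).
SELECT TRANSLATE('2*[3+4]/{7-2}', '[]{}', '()()') AS translated;
-- translated: 2*(3+4)/(7-2)

-- REPLACE: the whole token is substituted by a value of any length.
SELECT REPLACE(N'I asked $$ASKED$$ questions.', N'$$ASKED$$', N'12') AS replaced;
-- replaced: I asked 12 questions.
```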

Thanks for your time.

EDIT:

Thanks to Randy's help, this looks like it works:

WITH cte_replace_tokens AS (

    SELECT replace(r.text, m.merge_field, m.user_data) as [final], m.merge_field, s.id, 1 AS i
    FROM store r
    INNER JOIN statement s ON s.store_id = r.id
    INNER JOIN statement_merges m ON m.statement_id = s.id
    WHERE m.statement_id = 1

    UNION ALL

    SELECT replace(r.final, m.merge_field, m.user_data) as [final], m.merge_field, r.id, r.i + 1 AS i
    FROM cte_replace_tokens r
    INNER JOIN statement_merges m ON m.statement_id = r.id
    WHERE m.merge_field > r.merge_field

) 

select TOP 1 final from cte_replace_tokens ORDER BY i DESC
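I will check on a larger database whether the performance is acceptable.

At least I can now fill in one statement; I still need to work out how to extract a list.


Thanks again.

Doing this kind of task in the SQL engine is not recommended, but if you want to, you need to do it in a loop, using a cursor inside a function or a stored procedure, like this: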

DECLARE @merge_field nvarchar(30)
    , @user_data nvarchar(MAX)
    , @statementid INT = 1 
    , @text varchar(MAX) = 'Waw, stackoverflow is an amazing library of lost people in the IT hell, and i have the feeling that $$PERC_SAT$$ of the users found a solution, personally I asked $$ASKED$$ questions.'

DECLARE  merge_statements CURSOR FAST_FORWARD
 FOR SELECT 
    sm.merge_field
    , sm.user_data 
    FROM dbo.statement_merges AS sm
    WHERE sm.statement_id = @statementid

 OPEN merge_statements
 FETCH NEXT FROM merge_statements
 INTO @merge_field , @user_data
 WHILE @@FETCH_STATUS = 0  
  BEGIN
    set @text = REPLACE(@text , @merge_field, @user_data )
    FETCH NEXT FROM merge_statements
    INTO @merge_field , @user_data
END 
CLOSE   merge_statements
DEALLOCATE merge_statements

SELECT @text
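
As a rough sketch of the "function" variant mentioned above, the same loop can be wrapped in a scalar function that reads the text from the store table instead of a literal. The function name dbo.fn_fill_statement and the cursor name merge_cursor are hypothetical; the tables and columns are the ones from the question.

```sql
-- Hypothetical helper: returns the stored text of one statement
-- with all of its merge fields replaced.
CREATE FUNCTION dbo.fn_fill_statement (@statement_id int)
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @text nvarchar(max),
            @merge_field nvarchar(30),
            @user_data nvarchar(max);

    -- Load the template text for this statement.
    SELECT @text = r.[text]
    FROM dbo.store AS r
    INNER JOIN dbo.[statement] AS s ON s.store_id = r.id
    WHERE s.id = @statement_id;

    -- Only local cursors may be used inside a function.
    DECLARE merge_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT sm.merge_field, sm.user_data
        FROM dbo.statement_merges AS sm
        WHERE sm.statement_id = @statement_id;

    OPEN merge_cursor;
    FETCH NEXT FROM merge_cursor INTO @merge_field, @user_data;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @text = REPLACE(@text, @merge_field, @user_data);
        FETCH NEXT FROM merge_cursor INTO @merge_field, @user_data;
    END;
    CLOSE merge_cursor;
    DEALLOCATE merge_cursor;

    RETURN @text;
END;
GO

-- Usage: one filled text per statement.
SELECT s.id, dbo.fn_fill_statement(s.id) AS filled_text
FROM dbo.[statement] AS s;
```

The function is called once per row, so this stays a row-by-row approach; it only moves the loop out of the application.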

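If a record is updated more than once by the same UPDATE, the last update wins. No update sees the effect of the others, so there is no cumulative effect. In some cases you can trick SQL into a cumulative effect with a local variable, but this is tricky and not recommended: the order becomes important, and the order is not reliable in an UPDATE.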
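
For illustration only, the local-variable trick mentioned above is usually written with a multi-row SELECT rather than an UPDATE. This is a sketch against the tables from the question; it relies on row-by-row variable assignment that SQL Server does not guarantee, which is why no result is claimed here.

```sql
DECLARE @text nvarchar(max);

-- Start from the stored template of statement 1.
SELECT @text = r.[text]
FROM dbo.store AS r
INNER JOIN dbo.[statement] AS s ON s.store_id = r.id
WHERE s.id = 1;

-- Assumption: the assignment is evaluated once per row, so each merge
-- field is applied on top of the previous result. Neither the
-- accumulation nor the row order is guaranteed by the engine.
SELECT @text = REPLACE(@text, m.merge_field, m.user_data)
FROM dbo.statement_merges AS m
WHERE m.statement_id = 1;

SELECT @text AS filled_text;
```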

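An alternative is recursion in a CTE: each step replaces one token and produces a new record from the previous one, until no tokens are left. Here is a working example that replaces 1 with A, 2 with B, and so on. I wonder whether some tricky XML could do this as well.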

if not object_id('tempdb..#Raw') is null drop table #Raw
CREATE TABLE #Raw(
    [test] [varchar](100) NOT NULL PRIMARY KEY CLUSTERED,
)

if not object_id('tempdb..#Token') is null drop table #Token
CREATE TABLE #Token(
    [id] [int] NOT NULL PRIMARY KEY CLUSTERED,
    [token] [char](1) NOT NULL,
    [value] [char](1) NOT NULL,
)

insert into #Raw values('123456'), ('1122334456')
insert into #Token values(1, '1', 'A'), (2, '2', 'B'), (3, '3', 'C'), (4, '4', 'D'), (5, '5', 'E'), (6, '6', 'F');

WITH cte_replace_tokens AS (

    SELECT r.test, replace(r.test, l.token, l.value) as [final], l.id
    FROM #Raw r
    CROSS JOIN #Token l
    WHERE l.id = 1

    UNION ALL

    SELECT r.test, replace(r.final, l.token, l.value) as [final], l.id
    FROM cte_replace_tokens r
    CROSS JOIN #Token l
    WHERE l.id = r.id + 1

) 
select * from cte_replace_tokens where id = 6

Here is a recursive solution.

MS SQL Server 2017 Schema Setup:

Query 1:


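With Randy's help, I think I have achieved what I wanted to do.

My real case is a contract made up of several statements, each of which can be:

- free text
- stored text without any merge
- stored text with one or more merges

This CTE does the job: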

WITH cte_replace_tokens AS (

    -- The anchor query does not join on merges and only LEFT JOINs store, because a statement can be free text
    SELECT COALESCE(r.text, s.part_text) AS [final], CAST('' AS NVARCHAR) AS merge_field, s.id, 1 AS i, s.contract_id
    FROM statement s
    LEFT JOIN store r ON s.store_id = r.id
    
    UNION ALL
    
    -- Loop until the last merge field; the iteration counter i lets the outer query keep the last record (all fields replaced)
    SELECT replace(r.final, m.merge_field, m.user_data) as [final], m.merge_field, r.id, r.i + 1 AS i, r.contract_id
    FROM cte_replace_tokens r
    INNER JOIN statement_merges m ON m.statement_id = r.id
    WHERE m.merge_field > r.merge_field AND r.final LIKE '%' + m.merge_field + '%'
    -- avoid wasted replacements by allowing only one merge_field per iteration (the next one in sort order)
    AND NOT EXISTS( SELECT mm.statement_id FROM statement_merges mm WHERE mm.statement_id = m.statement_id AND mm.merge_field > r.merge_field AND mm.merge_field < m.merge_field)
) 

select s.id, 
(select top 1 final from cte_replace_tokens t WHERE t.contract_id = s.contract_id AND t.id = s.id ORDER BY i DESC) as res
FROM statement s
where contract_id = 1

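If the CTE solution with the cross join is too slow, another option is to dynamically build a scalar function that contains every replacement needed from the token table. One scalar function call per record is then O(N). I get the same results as before.

The function is simple and probably will not get too long, depending on the size of the token table (the limit is the 256 MB batch size). I have seen attempts to build queries dynamically for performance backfire by moving the problem to compile time. That should not be an issue here.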

if not object_id('tempdb..#Raw') is null drop table #Raw
CREATE TABLE #Raw(
    [test] [varchar](100) NOT NULL PRIMARY KEY CLUSTERED,
)

if not object_id('tempdb..#Token') is null drop table #Token
CREATE TABLE #Token(
    [id] [int] NOT NULL PRIMARY KEY CLUSTERED,
    [token] [char](1) NOT NULL,
    [value] [char](1) NOT NULL,
)

insert into #Raw values('123456'), ('1122334456')
insert into #Token values(1, '1', 'A'), (2, '2', 'B'), (3, '3', 'C'), (4, '4', 'D'), (5, '5', 'E'), (6, '6', 'F');

DECLARE @sql varchar(max) = 'CREATE FUNCTION dbo.fn_ReplaceTokens(@raw varchar(8000)) RETURNS varchar(8000) AS BEGIN RETURN ';

WITH cte_replace_statement AS (

    SELECT a.id, CAST('replace(@raw,''' + a.token + ''',''' + a.value + ''')' as varchar(max)) as [statement]
    FROM #Token a
    WHERE a.id = 1

    UNION ALL

    SELECT n.id, CAST(replace(l.[statement], '@raw', 'replace(@raw,''' + n.token + ''',''' + n.value + ''')') as varchar(max)) as [statement]
    FROM #Token n
    INNER JOIN cte_replace_statement l
    ON n.id = l.id + 1

) 
select @sql += [statement] + ' END' from cte_replace_statement where id = 6

print @sql

if not object_id('dbo.fn_ReplaceTokens') is null drop function dbo.fn_ReplaceTokens
execute (@sql)

SELECT r.test, dbo.fn_ReplaceTokens(r.test) as [final] FROM #Raw r
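
For reference, with the six tokens inserted above, the string printed by print @sql should be equivalent to the following function. Line breaks are added here for readability; the generated string is a single line.

```sql
CREATE FUNCTION dbo.fn_ReplaceTokens(@raw varchar(8000)) RETURNS varchar(8000) AS
BEGIN
    -- The innermost call (token 6) is applied first, the outermost (token 1) last.
    RETURN replace(replace(replace(replace(replace(replace(
               @raw,'6','F'),'5','E'),'4','D'),'3','C'),'2','B'),'1','A')
END
```

Each later token wraps itself around @raw, so the nesting order is the reverse of the id order. For this token set the order does not matter, because no replacement value is itself a token.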

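I will read the answers carefully, thanks. But can you give me a hint: what exactly is it that SQL should not be asked to do?

Thanks Randy, this helps me a lot. But isn't the cross join dangerous? I will have many keys, many statements, many stores. I would like a comment on my edit.

If the token table is really large, an index might help. If it is small, it will be scanned anyway. I don't know how to avoid O(N^2) with a token table.

Thanks for giving me enough points to comment. If you look at my answer below, thanks to you I worked out a way to do the recursion with only one pass per token.

In your question, you make it look like there may be many rows between stores and statements. I tried to answer within those constraints. If you are looking for a straight token-replacement algorithm, this answer is as good as any.

Do you have to process all the records in one transaction?

All the records of a given contract, yes. For example, I have 5000 records in store; for one particular contract I have 5 to 20 statements, each with roughly 0 to 50 merge_fields. I want to output the list of filled statements, exposed through a view.

Results: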
| store_id | statement_id |                                 new_text |                                                     old_text |
|----------|--------------|------------------------------------------|--------------------------------------------------------------|
|        1 |            1 | Wow, stackoverflow...85%...12 questions. | $$(*)$$, stackoverflow...$$PERC_SAT$$...$$ASKED$$ questions. |
|        2 |            2 |                             Use TheFlux! |                                                  Use The @_@ |