SQL Server: searching for text in a column
I don't know which approach to use. Basically, I need a search string that can look for occurrences of several phrases in a single column, where each input phrase is separated by a space. So the input from the user looks like this:
"Phrase1 Phrase2 ... PhraseX" (number of phrases can 0 to unknown!, but say < 6)
Where 'Phrase1%' **AND** 'Phrase2%' **AND** ... 'PhraseX%'
...and so on. So all of the phrases need to be found; it is always a logical AND.
So, taking speed and performance into account, do I use:
lots of
Like 'Phrase1%' and like 'Phrase2%' and like ... 'PhraseX%' ?
or use
patindex('Phrase1%' , column) > 0 AND patindex('Phrase2%' , column) > 0
AND ... patindex('PhraseX%' , column)
or add a full-text search index and use:
Where Contains(Column, 'Phrase1*') AND Contains(Column, 'Phrase2*') AND ... Contains(Column, 'PhraseX*')
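There almost seem to be too many options, which is why I'm asking: what is the most efficient way to do this?
Many thanks for your wisdom...

If you are searching with AND, then the correct wildcard search would be: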
Like '%Phrase1%' and like '%Phrase2%' and like ... '%PhraseX%'
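There is no reason to use patindex() here, because like is sufficient and well optimized. Well optimized, but this particular case cannot be made efficient: a pattern with a leading wildcard requires a full table scan. And if the text field is really, really big (I mean at least thousands or tens of thousands of characters), then performance will not be good.
The solution is full-text search. You would phrase it like this: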
where CONTAINS(column, 'Phrase1 AND phrase2 AND . . . ');
The only issue here is when the "phrases" you are looking for (which seem to be words) are stop words.
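As a rough sketch of what the full-text route involves, the statements below create a catalog and a full-text index and then query it. The table `dbo.Documents`, its columns, the key index `PK_Documents` and the catalog `DocumentsCatalog` are all hypothetical names chosen for illustration, and the sketch assumes the Full-Text Search feature is installed on the instance. Prefix terms are written inside double quotation marks with a trailing asterisk.

```sql
-- Hypothetical table; all names here are for illustration only.
CREATE TABLE dbo.Documents
(
    DocumentId INT IDENTITY(1, 1) NOT NULL,
    Body       NVARCHAR(MAX) NOT NULL,
    CONSTRAINT PK_Documents PRIMARY KEY CLUSTERED (DocumentId)
);
GO

-- A full-text index needs a catalog and a unique, single-column,
-- non-nullable key index on the table.
CREATE FULLTEXT CATALOG DocumentsCatalog;
GO

CREATE FULLTEXT INDEX ON dbo.Documents (Body)
    KEY INDEX PK_Documents
    ON DocumentsCatalog
    WITH CHANGE_TRACKING AUTO;
GO

-- All prefix terms must be present (logical AND).
SELECT DocumentId, Body
FROM dbo.Documents
WHERE CONTAINS(Body, N'"Phrase1*" AND "Phrase2*"');
```

The index is populated in the background, so rows inserted just before the query may not be visible to CONTAINS until population has caught up.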
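In conclusion, if you have more than a few thousand rows, or the text field you are searching is longer than a few thousand characters, use the full-text option. This is only a guideline. If you are searching a reference table with 100 rows and looking in description fields of at most 100 characters, then the like approach should be fine.

I personally like this solution: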
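DECLARE @temp TABLE (title NVARCHAR(50))

INSERT INTO @temp (title)
VALUES ('Phrase1 33'), ('test Phrase2'), ('blank')

-- EXISTS keeps a row as soon as one phrase matches,
-- so this returns rows containing at least one of the phrases
SELECT t.*
FROM @temp t
WHERE EXISTS(
    SELECT 1
    FROM (
        VALUES ('Phrase1'), ('Phrase2'), ('PhraseX')
    ) c(phrase)
    WHERE t.title LIKE '%' + c.phrase + '%'
)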
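Because EXISTS is satisfied by any single matching phrase, the query above returns rows that contain at least one of the phrases. The question asks for a logical AND, which can be sketched by inverting the test: keep a row only if no phrase is missing from it. The variant below also splits the raw user input on spaces. It assumes SQL Server 2016 or later (STRING_SPLIT requires compatibility level 130 or higher); the variable `@input` and the sample rows are made up for illustration.

```sql
-- Assumption: SQL Server 2016+ so that STRING_SPLIT is available.
DECLARE @input NVARCHAR(200) = N'Phrase1 Phrase2';  -- hypothetical user input

DECLARE @temp TABLE (title NVARCHAR(50));

INSERT INTO @temp (title)
VALUES (N'Phrase1 then Phrase2'), (N'test Phrase2'), (N'blank');

-- A row qualifies only if there is no input phrase that it lacks (logical AND).
SELECT t.title
FROM @temp AS t
WHERE NOT EXISTS (
    SELECT 1
    FROM STRING_SPLIT(@input, N' ') AS s
    WHERE s.value <> N''                          -- ignore empty pieces from repeated spaces
      AND t.title NOT LIKE N'%' + s.value + N'%'  -- this phrase is missing from the row
);
```

With this sample data only 'Phrase1 then Phrase2' contains both phrases. An empty input yields no phrases to check, so every row is returned. LIKE wildcard characters in the user input (%, _, [) are not escaped in this sketch.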
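
Ideally, this should be done with the help of full-text search, as described above. However, if you do not have full-text search configured for your database, here is a performance-intensive solution for a prioritized string search. Note: it returns rows for partial or complete combinations of the input words (rows containing one or more words of the search string, in any order):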
-- table to search in
drop table if exists dbo.myTable;
go
CREATE TABLE dbo.myTable
(
myTableId int NOT NULL IDENTITY (1, 1),
code varchar(200) NOT NULL,
description varchar(200) NOT NULL -- this column contains the values we are going to search in
) ON [PRIMARY]
GO
-- function to split space separated search string into individual words
drop function if exists [dbo].[fnSplit];
go
CREATE FUNCTION [dbo].[fnSplit] (@StringInput nvarchar(max),
                                 @Delimiter nvarchar(1))
RETURNS @OutputTable TABLE (
    id nvarchar(1000)
)
AS
BEGIN
    DECLARE @String nvarchar(100);
    WHILE LEN(@StringInput) > 0
    BEGIN
        -- take everything before the first delimiter (or the whole string if none is left)
        SET @String = LEFT(@StringInput,
                           ISNULL(NULLIF(CHARINDEX(@Delimiter, @StringInput) - 1, -1),
                                  LEN(@StringInput)));
        -- drop the extracted word and its delimiter from the remaining input
        SET @StringInput = SUBSTRING(@StringInput,
                                     ISNULL(NULLIF(CHARINDEX(@Delimiter, @StringInput), 0),
                                            LEN(@StringInput)) + 1,
                                     LEN(@StringInput));
        INSERT INTO @OutputTable (id)
        VALUES (@String);
    END;
    RETURN;
END;
GO
-- this is the search script which can be optionally converted to a stored procedure /function
declare @search varchar(max) = 'infection upper acute genito'; -- enter your search string here
-- the searched string above should give rows containing the following
-- infection in upper side with acute genitointestinal tract
-- acute infection in upper teeth
-- acute genitointestinal pain
if (len(trim(@search)) = 0) -- if search string is empty, just return records ordered alphabetically
begin
select 1 as Priority ,myTableid, code, Description from myTable order by Description
return;
end
declare @splitTable Table(
wordRank int Identity(1,1), -- individual words are assigned a priority in order of occurrence/position
word varchar(200)
)
declare @nonWordTable Table( -- table to trim out auxiliary verbs, prepositions etc. from the search
id varchar(200)
)
insert into @nonWordTable values
('of'),
('with'),
('at'),
('in'),
('for'),
('on'),
('by'),
('like'),
('up'),
('off'),
('near'),
('is'),
('are'),
(','),
(':'),
(';')
insert into @splitTable
select id from dbo.fnSplit(@search,' '); -- this function gives you a table with rows containing all the space separated words of the search like in this e.g., the output will be -
-- id
-------------
-- infection
-- upper
-- acute
-- genito
delete s from @splitTable s join @nonWordTable n on s.word = n.id; -- trimming out non-words here
declare @countOfSearchStrings int = (select count(word) from @splitTable); -- count of space separated words for search
declare @highestPriority int = POWER(@countOfSearchStrings,3);
with plainMatches as
(
select myTableid, @highestPriority as Priority from myTable where Description like @search -- exact matches have highest priority
union
select myTableid, @highestPriority-1 as Priority from myTable where Description like @search + '%' -- then with something at the end
union
select myTableid, @highestPriority-2 as Priority from myTable where Description like '%' + @search -- then with something at the beginning
union
select myTableid, @highestPriority-3 as Priority from myTable where Description like '%' + @search + '%' -- then if the word falls somewhere in between
),
splitWordMatches as( -- give each searched word a rank based on its position in the searched string
-- and calculate its char index in the field to search
select myTable.myTableid, (@countOfSearchStrings - s.wordRank) as Priority, s.word,
wordIndex = CHARINDEX(s.word, myTable.Description) from myTable join @splitTable s on myTable.Description like '%'+ s.word + '%'
-- and not exists(select myTableid from plainMatches p where p.myTableId = myTable.myTableId) -- need not look into rows that have already been found in plainmatches as they are highest ranked
-- this one takes a long time though, so commenting it, will have no impact on the result
),
wordIndexRatings as( -- reverse the char indexes retrieved above so that words occurring earlier carry a higher weight
-- and then normalize them to sequential values
select myTableid, Priority, word, ROW_NUMBER() over (partition by myTableid order by wordindex desc) as comparativeWordIndex
from splitWordMatches
)
,
wordIndexSequenceRatings as ( -- need to do this to ensure that if the same set of words from search string is found in two rows,
-- their sequence in the field value is taken into account for higher priority
select w.myTableid, w.word, (w.Priority + w.comparativeWordIndex + coalesce(sequncedPriority ,0)) as Priority
from wordIndexRatings w left join
(
select w1.myTableid, w1.priority, w1.word, w1.comparativeWordIndex, count(w1.myTableid) as sequncedPriority
from wordIndexRatings w1 join wordIndexRatings w2 on w1.myTableId = w2.myTableId and w1.Priority > w2.Priority and w1.comparativeWordIndex>w2.comparativeWordIndex
group by w1.myTableid, w1.priority,w1.word, w1.comparativeWordIndex
)
sequencedPriority on w.myTableId = sequencedPriority.myTableId and w.Priority = sequencedPriority.Priority
),
prioritizedSplitWordMatches as ( -- this calculates the cumulative priority for a field value
select w1.myTableId, sum(w1.Priority) as OverallPriority from wordIndexSequenceRatings w1 join wordIndexSequenceRatings w2 on w1.myTableId = w2.myTableId
where w1.word <> w2.word group by w1.myTableid
),
completeSet as (
select myTableid, priority from plainMatches -- get plain matches which should be highest ranked
union
select myTableid, OverallPriority as priority from prioritizedSplitWordMatches -- get ranked split word matches (which are ordered based on word rank in search string and sequence)
union
select myTableid, Priority as Priority from splitWordMatches -- get one word matches
),
maximizedCompleteSet as( -- set the priority of a field value = maximum priority for that field value
select myTableid, max(priority) as Priority from completeSet group by myTableId
)
select priority, myTable.myTableid , code, Description from maximizedCompleteSet m join myTable on m.myTableId = myTable.myTableId
order by Priority desc, Description -- order by priority desc to get highest rated items on top
--offset 0 rows fetch next 50 rows only -- optional paging
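To try the script, a few rows can be loaded into dbo.myTable after the table and dbo.fnSplit have been created and before the search is run. The sketch below uses the three descriptions quoted in the script's comments; the code values are invented for illustration.

```sql
-- The code values (A01..A03) are made up for illustration; the descriptions
-- are the ones quoted in the comments of the search script.
INSERT INTO dbo.myTable (code, description)
VALUES ('A01', 'infection in upper side with acute genitointestinal tract'),
       ('A02', 'acute infection in upper teeth'),
       ('A03', 'acute genitointestinal pain');

-- Quick check of the splitter on the same search string:
-- it returns one row per space-separated word.
SELECT id FROM dbo.fnSplit('infection upper acute genito', ' ');
```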