SQL Server: Tell me the SQL Server Full-Text searcher is crazy, not me
Tags: sql-server, sql-server-2000, full-text-search, full-text-indexing
I have some customers with a particular address that the user is searching for: 123 generic way. There are 5 rows in the database that match:
ResidentialAddress1
=============================
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY
I run an FT query to find these rows. I'll show you each step as I add more criteria to the search:
SELECT ResidentialAddress1 FROM Patrons
WHERE CONTAINS(Patrons.ResidentialAddress1, '"123*"')
ResidentialAddress1
=========================
123 MAPLE STREET
12345 TEST
123 MINE STREET
123 GENERIC WAY
123 FAKE STREET
...
(30 row(s) affected)
OK, so far so good. Now add the word "generic":
Excellent. Now I'll add the final keyword that the user wants to make sure exists:
SELECT ResidentialAddress1 FROM Patrons
WHERE CONTAINS(Patrons.ResidentialAddress1, '"123*"')
AND CONTAINS(Patrons.ResidentialAddress1, '"generic*"')
AND CONTAINS(Patrons.ResidentialAddress1, '"way*"')
ResidentialAddress1
------------------------------
(0 row(s) affected)
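Huh? No rows? What if I query for just "way*":
At first I thought it might be because of the *, and that it requires the root way to be followed by more characters. But that is not the case:
- Searching for "123*" matches "123"
- Searching for "generic*" matches "generic"
- Books Online says the asterisk matches zero, one, or more characters
So what happens if I remove the *: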
SELECT ResidentialAddress1 FROM Patrons
WHERE CONTAINS(Patrons.ResidentialAddress1, '"way"')
Server: Msg 7619, Level 16, State 1, Line 1
A clause of the query contained only ignored words.
So one might think that you are simply not allowed to search for way at all, either alone or as a root. But that is not true either:
SELECT * FROM Patrons
WHERE CONTAINS(Patrons.*, '"way*"')
AccountNumber FirstName Lastname
------------- --------- --------
33589 JOHN WAYNE
So, in summary, the user is searching for rows that contain all the words: 123 generic way. I correctly translate that into the WHERE clauses:
SELECT * FROM Patrons
WHERE CONTAINS(Patrons.*, '"123*"')
AND CONTAINS(Patrons.*, '"generic*"')
AND CONTAINS(Patrons.*, '"way*"')
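Which returns no rows. Tell me this just isn't going to work, that it's not my fault, and that SQL Server is crazy.
Note: I have emptied the FT index and rebuilt it.
Update One
Update Two
Pretend the user typed in:
123 generic wa
The real problem is that the user typed in something perfectly valid, and they expect to see what anyone would expect to see.
Update Three: someone asked for this, so it's not my fault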
CREATE TABLE [dbo].[Patrons] (
[PatronGUID] uniqueidentifier ROWGUIDCOL NOT NULL ,
[AccountNumber] [bigint] NULL ,
[FirstName] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MiddleInitial] [varchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Lastname] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[EyeColor] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[HairColor] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Gender] [varchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[Birthday] [datetime] NULL ,
[Height] [int] NULL ,
[Weight] [int] NULL ,
[FacialHair] [tinyint] NULL ,
[Nationality] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IdentifyingMarks] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DriversLicenseNumber] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DriversLicenseRegion] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DriversLicenseCountry] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DriversLicenseExpires] [datetime] NULL ,
[DriversLicenseDateVerified] [datetime] NULL ,
[PassportNumber] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PassportRegion] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PassportCountry] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PassportExpires] [datetime] NULL ,
[PassportDateVerified] [datetime] NULL ,
[OtherIdentificationNumber] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OtherIdentificationRegion] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OtherIdentificationCountry] [varchar] (2) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OtherIdentificationType] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[OtherIdentificationExpires] [datetime] NULL ,
[OtherIdentificationDateVerified] [datetime] NULL ,
[ResidentialAddress1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResidentialAddress2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResidentialAddress3] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResidentialCity] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResidentialZipCode] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResidentialRegion] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResidentialCountry] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ResidentialPhoneNumber] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[CountryOfResidence] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessAddress1] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessAddress2] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessAddress3] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessCity] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessRegion] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessZipCode] [varchar] (15) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessCountry] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessName] [varchar] (25) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[BusinessPhone] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PositionWithFirm] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[EmployerTelephone] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[MemberCardType] [varchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PlayerStatusCode] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[AccountType] [varchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[AccountStatus1] [varchar] (1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[AccountStatus2] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[IsVIPExchangeRate] [tinyint] NULL ,
[ChangedUserGUID_Depricated] [uniqueidentifier] NULL ,
[ChangedUser] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[ChangedDate] [datetime] NULL ,
[ChangedWorkstation] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[PendingUpdates_Depricated] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Patrons] ADD
CONSTRAINT [DF_Patrons_PatronGUID] DEFAULT (newid()) FOR [PatronGUID],
CONSTRAINT [PK_Patrons] PRIMARY KEY NONCLUSTERED
(
[PatronGUID]
) WITH FILLFACTOR = 90 ON [PRIMARY]
GO
if (select DATABASEPROPERTY(DB_NAME(), N'IsFullTextEnabled')) <> 1
exec sp_fulltext_database N'enable'
GO
if not exists (select * from dbo.sysfulltextcatalogs where name = N'TheFullTextCatalog')
exec sp_fulltext_catalog N'TheFullTextCatalog', N'create'
GO
exec sp_fulltext_table N'[dbo].[Patrons]', N'create', N'TheFullTextCatalog', N'PK_Patrons'
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'FirstName', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'MiddleInitial', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'Lastname', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'EyeColor', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'IdentifyingMarks', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialAddress1', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialAddress2', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialAddress3', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialCity', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialZipCode', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialRegion', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialCountry', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'ResidentialPhoneNumber', N'add', 1033
GO
exec sp_fulltext_column N'[dbo].[Patrons]', N'CountryOfResidence', N'add', 1033
GO
exec sp_fulltext_table N'[dbo].[Patrons]', N'activate'
GO
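I can't simply fold everything into a single CONTAINS(Patrons.*, 'words...') call, because some items are not covered, logically or physically, by the FT index. For example, the user queries for:
6/4/2010 ian boyd 619
That gives four keywords:
- 6/4/2010
- ian
- boyd
- 619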
WHERE 6/4/2010 is in the row
AND ian is in the row
AND boyd is in the row
AND 619 is in the row
Which gets translated into a partial query something like:
WHERE --Keyword 1: 6/4/2010
(
((ChangedDate >= '20100604') AND (ChangedDate < '20100605'))
OR
((LastTransactionDate >= '20100604') AND (LastTransactionDate < '20100605'))
OR
(CONTAINS(Patrons.*, '"6/4/2010*"'))
)
AND --Keyword 2: ian
(
CONTAINS(Patrons.*, '"ian*"')
)
AND --Keyword 3: boyd
(
CONTAINS(Patrons.*, '"boyd*"')
)
AND --Keyword 4: 619
(
(AccountNumber IN (SELECT CAST(619 AS bigint)))
OR
(CONTAINS(Patrons.*, '"619*"'))
)
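
The per-keyword expansion above can be sketched in application code. This is only an illustration under assumptions, not the asker's actual implementation: it is written in Python, the function name predicates_for is hypothetical, dates are assumed to be typed as month/day/year, and only the ChangedDate column is checked for brevity. Real code should pass values as query parameters instead of building SQL text.

```python
from datetime import datetime, timedelta


def predicates_for(keyword):
    """Return the SQL fragments (to be OR-ed together) that one keyword expands to."""
    parts = []
    try:
        # A keyword that parses as a date becomes a half-open one-day range.
        day = datetime.strptime(keyword, "%m/%d/%Y")
        start = day.strftime("%Y%m%d")
        end = (day + timedelta(days=1)).strftime("%Y%m%d")
        parts.append("(ChangedDate >= '%s') AND (ChangedDate < '%s')" % (start, end))
    except ValueError:
        pass  # not a date, so no date predicate
    if keyword.isascii() and keyword.isdigit():
        # A purely numeric keyword may also be an account number.
        parts.append("AccountNumber IN (SELECT CAST(%d AS bigint))" % int(keyword))
    # Every keyword is also tried as a full-text prefix term.
    # Double quotes are dropped and single quotes doubled to keep the literal intact.
    term = keyword.replace('"', '').replace("'", "''")
    parts.append("CONTAINS(Patrons.*, '\"%s*\"')" % term)
    return parts


for kw in "6/4/2010 ian boyd 619".split():
    print(kw, "->", " OR ".join(predicates_for(kw)))
```

The point of the sketch is that only the last fragment of each keyword goes through the full-text index, which is why the keywords cannot all be merged into one CONTAINS call.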
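One of the answerers was looking at the simplified example given in the original question rather than at the real world. Saying that having multiple AND clauses is incorrect is naive.
The message is telling you that "way" is a stop word, which means it is ignored and not indexed. That is why you can find "wayne" but not "way".
So, no, it's not crazy, and neither are you. It's just a simple misunderstanding.
Maybe it needs more than three letters. Try another three-letter word, like gen*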
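
The stop-word explanation above can be illustrated with a toy model. This is a sketch in Python, not how SQL Server's full-text engine is actually implemented; NOISE_WORDS, build_index and prefix_search are hypothetical names, and the noise list holds only the one word discussed here.

```python
NOISE_WORDS = {"way"}  # assumption: stand-in for the server's noise-word list


def build_index(rows):
    """Map each indexed word to the set of row numbers that contain it."""
    index = {}
    for row_id, text in enumerate(rows):
        for word in text.lower().split():
            if word in NOISE_WORDS:
                continue  # noise words never reach the index
            index.setdefault(word, set()).add(row_id)
    return index


def prefix_search(index, prefix):
    """Return the rows holding any indexed word that starts with the prefix."""
    hits = set()
    for word, row_ids in index.items():
        if word.startswith(prefix.lower()):
            hits |= row_ids
    return hits


rows = ["123 GENERIC WAY", "JOHN WAYNE"]
index = build_index(rows)
print(prefix_search(index, "way"))      # matches only the WAYNE row
print(prefix_search(index, "generic"))  # matches only the address row
```

In this model the address row can never be found through "way", because that word was discarded when the index was built, while "wayne" was indexed normally.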
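In that post, Jeff mentions that they can be turned off.
Solution 1:
You may want to try the 'transform noise words' option (SQL Server 2008).
With this option set to 1, a noise word inside a query is ignored and the rest of the query is still evaluated, instead of the query coming back empty.
Example (the sp_configure script is listed at the end of this page).
Edit 1:
Hopefully older versions of MS SQL have something similar?
You probably used the system stoplist when you created the FT index. The word way is in it. You can see it with the following query: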
SELECT *
FROM sys.fulltext_system_stopwords
WHERE stopword = 'way'
AND language_id = 1033
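You can turn off the stoplist or create a custom one, but the better solution is to write the query correctly: don't use multiple WHERE CONTAINS clauses, combine them into a single one. Otherwise SQL Server may not be able to use the FT index efficiently.
Your query should look like this: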
SELECT ResidentialAddress1 FROM Patrons
WHERE CONTAINS(Patrons.ResidentialAddress1, '"123*" AND "generic*" AND "way*"')
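If you do that, the stop word is simply ignored; the query still returns all the same results it would return if the term way* had not been included.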
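
A minimal sketch of how an application could fold the user's keywords into the single search condition recommended above, instead of emitting one CONTAINS per keyword. It is written in Python purely for illustration; the function name build_contains_condition is hypothetical, and the resulting string is meant to be handed to the query as a parameter rather than concatenated into the SQL text.

```python
def build_contains_condition(user_input):
    """Turn 'a b c' into '"a*" AND "b*" AND "c*"' for use in one CONTAINS()."""
    terms = []
    for word in user_input.split():
        # Drop embedded double quotes so a keyword cannot break out of its quoted prefix term.
        cleaned = word.replace('"', '')
        if cleaned:
            terms.append('"' + cleaned + '*"')
    return ' AND '.join(terms)


print(build_contains_condition('123 generic way'))
```

Whether the server then ignores a noise word inside that combined condition depends on the server version and its noise-word configuration, so the result should be checked against the real index.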
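Edit: I just noticed you tagged this sql-server-2000, so the first query might not work. In SQL 2000 they are called "noise words", and I believe the configuration is global; you don't get separate stoplists. However, if you write one WHERE CONTAINS clause instead of several, you will still get results.
To edit the noise words in SQL Server 2000, you have to edit the language-specific file in the SQL Server FTDATA config folder. More details here:
Comments:
What do you suggest? That is, how do I tell SQL Server to search for way when it can, and not to search for way when it can't?
@Ardman Tried "wa*", results added to the question. It didn't work.
@Ian: I upvoted the two more detailed answers, so try those. The short answer is that you need to remove "way" from your stoplist and reindex.
/facepalm I was hoping for any solution other than one that involves manually editing text files on machines I don't have access to. In the end, the solution might have to involve converting CONTAINS(table.*) into dozens of LIKE clauses...
@Ian: In my experience, projects that start out using the database's FT search feature tend to end up moving that aspect to engines like Lucene and DTSearch, precisely because the relational database's search is limited and frustrating.
"gen*" works, which makes sense since "123*" works just as well.
+1 for the question anyway. The SQL Server full-text search engine is crazy, not you! ;-)
What a fantastic question. I can't understand why this has only received 2 upvotes.
Running SELECT * FROM Patrons WHERE CONTAINS(Patrons.*, '"123*" AND "generic*" AND "way*"') returns no rows. What about for you?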
Example for the 'transform noise words' option (SQL Server 2008) mentioned in the answers:
sp_configure 'show advanced options', 1
RECONFIGURE
GO
sp_configure 'transform noise words', 1
RECONFIGURE
GO