Elasticsearch cannot search for a token equal to *abc
elasticsearch, nest
Suppose I have indexed documents like this: 1: abc, 2: *abc, 3: abc def, 4: def*abc, 5: 1abc. I would like search to behave as follows:
search = abc, results = 1, 2, 3, 4, 5
search = *abc, results = 2, 4
I am using a custom analyzer defined as follows:
Add("myAnalyzer", new CustomAnalyzer
{
    Tokenizer = "myTokenizer",
    Filter = new[]
    {
        "myAsciiFolding",
        "lowercase",
        "ipPattern"
    }
})
The tokenizer is defined like this:
Add("ipTokenizer", new PatternTokenizer
{
    Pattern = @"\W+"
})
And the ASCII folding filter like this:
Add("ipAsciiFolding", new AsciiFoldingTokenFilter
{
    PreserveOriginal = true
})
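In practice, the first search works, but the second one (with the "*") returns the same results as the first. Is there a way to specify multiple tokenizers to get the behavior I expect?
Any ideas?
Thx

To do this:
search = abc, results = 1, 2, 3, 4, 5
search = *abc, results = 2, 4
where you are searching within strings (finding "abc" in "*abc") and you do not want a search for "*abc" to match "*def abc", I would use an nGram tokenizer to tokenize the data: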
curl -XPUT 'localhost:9200/test' -d '
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "my_ngram_analyzer" : {
          "tokenizer" : "my_ngram_tokenizer"
        }
      },
      "tokenizer" : {
        "my_ngram_tokenizer" : {
          "type" : "nGram",
          "min_gram" : "2",
          "max_gram" : "5",
          "token_chars": [ "letter", "digit", "punctuation", "symbol" ]
        }
      }
    }
  }
}'
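To get a feel for what this tokenizer would put in the index, here is a rough plain-Python approximation (it is not Elasticsearch itself). It assumes that, for the sample documents, whitespace is the only character excluded by token_chars, so text is split on whitespace and "*" stays inside the tokens. The function name ngram_tokens is made up for illustration.

```python
def ngram_tokens(text, min_gram=2, max_gram=5):
    """Approximate an nGram tokenizer: split on whitespace (assumption),
    then emit every substring of length min_gram..max_gram per chunk."""
    tokens = []
    for chunk in text.split():
        for start in range(len(chunk)):
            for size in range(min_gram, max_gram + 1):
                if start + size <= len(chunk):
                    tokens.append(chunk[start:start + size])
    return tokens


print(ngram_tokens("*abc"))
# ['*a', '*ab', '*abc', 'ab', 'abc', 'bc']
```

Under these assumptions both "abc" and "*abc" end up as separate tokens for document 2, which is what lets the two searches return different results.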
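If your terms (*abc, etc.) are all 5 characters or fewer, I would use a term query (that is, the search term will be found as an exactly matching term in the index).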
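The exact-match idea can be checked against the five sample documents from the question with a small plain-Python sketch. It only imitates the behavior: the grams are built with the same whitespace-splitting assumption as the index settings above suggest, and DOCS, indexed_terms and term_search are hypothetical names, not Elasticsearch or NEST APIs.

```python
# The five sample documents from the question.
DOCS = {1: "abc", 2: "*abc", 3: "abc def", 4: "def*abc", 5: "1abc"}


def indexed_terms(text, min_gram=2, max_gram=5):
    """Set of grams an nGram tokenizer would roughly produce
    (assumption: only whitespace separates chunks)."""
    return {
        chunk[start:start + size]
        for chunk in text.split()
        for start in range(len(chunk))
        for size in range(min_gram, max_gram + 1)
        if start + size <= len(chunk)
    }


def term_search(term):
    """Imitate an exact term lookup: the term is not analyzed,
    it must appear verbatim among a document's indexed grams."""
    return sorted(doc_id for doc_id, text in DOCS.items()
                  if term in indexed_terms(text))


print(term_search("abc"))   # [1, 2, 3, 4, 5]
print(term_search("*abc"))  # [2, 4]
```

This only works because both search terms fit within max_gram (5 characters); a longer term would never exist as a single gram in the index.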
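If your terms are longer than 5 characters, I would use a query that takes a default_operator and set that operator explicitly.

What analyzer are you using in your mapping? If you want * to be treated as data rather than ignored, you may need to switch to a different analyzer.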
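The question about the analyzer matters because a pattern tokenizer splits the text wherever its pattern matches, and \W+ matches "*". The plain-Python sketch below imitates that with re.split plus lowercasing; it is an approximation, not the Elasticsearch implementation, and the whitespace-based alternative is only one possible option, not necessarily the one the commenter had in mind. Both function names are invented for illustration.

```python
import re


def pattern_tokens(text, pattern=r"\W+"):
    """Imitate a pattern tokenizer followed by a lowercase filter:
    split on the pattern and drop empty pieces."""
    return [piece.lower() for piece in re.split(pattern, text) if piece]


def whitespace_tokens(text):
    """Hypothetical alternative: split on whitespace only, keeping '*'."""
    return [piece.lower() for piece in text.split()]


print(pattern_tokens("*abc"))     # ['abc']
print(pattern_tokens("abc"))      # ['abc']
print(pattern_tokens("def*abc"))  # ['def', 'abc']
print(whitespace_tokens("*abc"))  # ['*abc']
```

With \W+ the searches "abc" and "*abc" analyze to the same token, which would explain why the second search returns the same results as the first.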