Java: how to search within words (infix search) in Elasticsearch?
Hi, I am working on a search feature and we need to search within Elasticsearch terms. Right now our search works like this: if we want to find "company name", we can search with "c", "n", "co", "comp", "na", "nam" — but searching with "mp", "any", "ame", "me", or "p" should also return "company name". Please advise how to handle this, and whether any such search capability exists. I tried a wildcard query, but it does not work across multiple fields. If I have missed anything, please let me know or suggest how to implement this.

You can use the ngram tokenizer. It first breaks text down into words whenever it encounters one of a list of specified characters, and then it emits N-grams of each word of the specified length. Adding a working example with index data, mapping, search query, and results.

Index Mapping:
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 20,
          "token_chars": [
            "letter",
            "digit"
          ]
        }
      }
    },
    "max_ngram_diff": 50
  },
  "mappings": {
    "properties": {
      "body": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}
Analyze API:

GET /64975316/_analyze
{
  "analyzer": "my_analyzer",
  "text": "company name"
}
This generates the following tokens:
{
"tokens": [
{
"token": "c",
"start_offset": 0,
"end_offset": 1,
"type": "word",
"position": 0
},
{
"token": "co",
"start_offset": 0,
"end_offset": 2,
"type": "word",
"position": 1
},
{
"token": "com",
"start_offset": 0,
"end_offset": 3,
"type": "word",
"position": 2
},
{
"token": "comp",
"start_offset": 0,
"end_offset": 4,
"type": "word",
"position": 3
},
{
"token": "compa",
"start_offset": 0,
"end_offset": 5,
"type": "word",
"position": 4
},
{
"token": "compan",
"start_offset": 0,
"end_offset": 6,
"type": "word",
"position": 5
},
{
"token": "company",
"start_offset": 0,
"end_offset": 7,
"type": "word",
"position": 6
},
{
"token": "o",
"start_offset": 1,
"end_offset": 2,
"type": "word",
"position": 7
},
{
"token": "om",
"start_offset": 1,
"end_offset": 3,
"type": "word",
"position": 8
},
{
"token": "omp",
"start_offset": 1,
"end_offset": 4,
"type": "word",
"position": 9
},
{
"token": "ompa",
"start_offset": 1,
"end_offset": 5,
"type": "word",
"position": 10
},
{
"token": "ompan",
"start_offset": 1,
"end_offset": 6,
"type": "word",
"position": 11
},
{
"token": "ompany",
"start_offset": 1,
"end_offset": 7,
"type": "word",
"position": 12
},
{
"token": "m",
"start_offset": 2,
"end_offset": 3,
"type": "word",
"position": 13
},
{
"token": "mp",
"start_offset": 2,
"end_offset": 4,
"type": "word",
"position": 14
},
{
"token": "mpa",
"start_offset": 2,
"end_offset": 5,
"type": "word",
"position": 15
},
{
"token": "mpan",
"start_offset": 2,
"end_offset": 6,
"type": "word",
"position": 16
},
{
"token": "mpany",
"start_offset": 2,
"end_offset": 7,
"type": "word",
"position": 17
},
{
"token": "p",
"start_offset": 3,
"end_offset": 4,
"type": "word",
"position": 18
},
{
"token": "pa",
"start_offset": 3,
"end_offset": 5,
"type": "word",
"position": 19
},
{
"token": "pan",
"start_offset": 3,
"end_offset": 6,
"type": "word",
"position": 20
},
{
"token": "pany",
"start_offset": 3,
"end_offset": 7,
"type": "word",
"position": 21
},
{
"token": "a",
"start_offset": 4,
"end_offset": 5,
"type": "word",
"position": 22
},
{
"token": "an",
"start_offset": 4,
"end_offset": 6,
"type": "word",
"position": 23
},
{
"token": "any",
"start_offset": 4,
"end_offset": 7,
"type": "word",
"position": 24
},
{
"token": "n",
"start_offset": 5,
"end_offset": 6,
"type": "word",
"position": 25
},
{
"token": "ny",
"start_offset": 5,
"end_offset": 7,
"type": "word",
"position": 26
},
{
"token": "y",
"start_offset": 6,
"end_offset": 7,
"type": "word",
"position": 27
},
{
"token": "n",
"start_offset": 8,
"end_offset": 9,
"type": "word",
"position": 28
},
{
"token": "na",
"start_offset": 8,
"end_offset": 10,
"type": "word",
"position": 29
},
{
"token": "nam",
"start_offset": 8,
"end_offset": 11,
"type": "word",
"position": 30
},
{
"token": "name",
"start_offset": 8,
"end_offset": 12,
"type": "word",
"position": 31
},
{
"token": "a",
"start_offset": 9,
"end_offset": 10,
"type": "word",
"position": 32
},
{
"token": "am",
"start_offset": 9,
"end_offset": 11,
"type": "word",
"position": 33
},
{
"token": "ame",
"start_offset": 9,
"end_offset": 12,
"type": "word",
"position": 34
},
{
"token": "m",
"start_offset": 10,
"end_offset": 11,
"type": "word",
"position": 35
},
{
"token": "me",
"start_offset": 10,
"end_offset": 12,
"type": "word",
"position": 36
},
{
"token": "e",
"start_offset": 11,
"end_offset": 12,
"type": "word",
"position": 37
}
]
}
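To see why the infix searches work (outside Elasticsearch), here is a minimal Python sketch that reproduces the same per-word N-gram emission as the tokenizer configured above, assuming `min_gram: 1`, `max_gram: 20`, and words split on anything that is not a letter or digit (mirroring `token_chars: ["letter", "digit"]`):

```python
import re

def ngram_tokens(text, min_gram=1, max_gram=20):
    """Emit every substring of length min_gram..max_gram of each word,
    splitting words on any character that is not a letter or digit."""
    tokens = []
    for word in re.findall(r"[A-Za-z0-9]+", text):
        for start in range(len(word)):
            for end in range(start + min_gram, min(start + max_gram, len(word)) + 1):
                tokens.append(word[start:end])
    return tokens

# Infix fragments such as "mp", "any", "ame", and "me" are all emitted,
# which is why a match query on any of them finds "company name".
tokens = ngram_tokens("company name")
```

For "company name" this produces the same 38 tokens listed by the Analyze API above (28 substrings of "company" plus 10 of "name").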
Index Data:
{
  "body": "company name"
}
Search Query:
{
  "query": {
    "match": {
      "body": "ame"
    }
  }
}
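Since the question mentions that wildcard did not work across multiple fields: once the ngram analyzer is applied to each field, a standard `multi_match` query covers several fields at once. A sketch, assuming a second hypothetical analyzed field `title` were added to the mapping:

```json
{
  "query": {
    "multi_match": {
      "query": "ame",
      "fields": ["body", "title"]
    }
  }
}
```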
"hits": [
{
"_index": "64975316",
"_type": "_doc",
"_id": "1",
"_score": 1.941854,
"_source": {
"body": "company name"
}
}
]
Search Result:
"hits": [
{
"_index": "64975316",
"_type": "_doc",
"_id": "1",
"_score": 1.941854,
"_source": {
"body": "company name"
}
}
]
@harish kumar, did you get a chance to look into my answer? Looking forward to your feedback :)

Yes, it is working — but can we use it together with filters? One more question, about highlighting search matches. The way search works in our application is that if we pass a pageSize, it returns only the matching field from each record. For example, with 4 records:

[{"name": "jhon", "age": 24, "lastname": "willam", "firstname": "henry"}, {"name": "kellin", "age": 24, "lastname": "kevin", "firstname": "mathew"}, {"name": "Keep", "age": 24, "lastname": "Jr", "firstname": "gomez"}, {"name": "Asif", "age": 24, "lastname": "peter", "firstname": "willaim kemp"}]

when the user searches with "ke", the result should look like:

{"results": [{"fieldId": "name", "matchingText": "Kellin"}, {"fieldId": "name", "matchingText": "keeper"}, {"fieldId": "name", "matchingText": "willaim kemp"}], "total_count": 4, // total no. of records matched in the ES index "page": 1, "pageSize": 3 // no. of records per page}
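For the follow-up about highlighting which text matched: Elasticsearch's built-in `highlight` option can be combined with the same match query. A minimal sketch — the `body` field name comes from the mapping above, the rest is standard query DSL:

```json
{
  "query": {
    "match": {
      "body": "ame"
    }
  },
  "highlight": {
    "fields": {
      "body": {}
    }
  }
}
```

Each hit then carries a `highlight` section with the matching fragments, which application code can map into a `fieldId`/`matchingText`-style response.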