
ElasticSearch - Java API search returns nothing unless my input query contains a wildcard (*)


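I am using the Java API to fetch documents from Elasticsearch. My documents contain the code field shown below, and I am trying to search it with the following patterns.

Code: MS-VMA1615-0D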

Input : *VMA1615-0*     -- returns MS-VMA1615-0D
Input : MS-VMA1615-0D   -- returns MS-VMA1615-0D
Input : *VMA1615-0      -- returns MS-VMA1615-0D
Input : *VMA*-0*        -- returns MS-VMA1615-0D

However, if I enter the input below, I get no results:

Input : VMA1615         -- returns nothing
I expect this input to return the code MS-VMA1615-0D as well.

Here is the Java code I am using:

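import java.io.IOException;
import java.io.UncheckedIOException;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryStringQueryBuilder;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

private final String INDEX = "products";
private final String TYPE = "doc";

// Inside the search method; `code` holds the user's input string.
SearchRequest searchRequest = new SearchRequest(INDEX);
searchRequest.types(TYPE);
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
// query_string query against the "code" field
QueryStringQueryBuilder qsQueryBuilder = new QueryStringQueryBuilder(code);
qsQueryBuilder.defaultField("code");
searchSourceBuilder.query(qsQueryBuilder);
searchSourceBuilder.size(50);
searchRequest.source(searchSourceBuilder);

SearchResponse searchResponse;
try {
    searchResponse = SearchEngineClient.getInstance().search(searchRequest);
} catch (IOException e) {
    // Fail loudly; the original discarded the error and then dereferenced a null response.
    throw new UncheckedIOException("Search against index " + INDEX + " failed", e);
}
Item item = null;
SearchHit[] searchHits = searchResponse.getHits().getHits();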
Here are my mapping details:

PUT products
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "code": {
          "type": "text",
          "analyzer": "custom_analyzer"
        }
      }
    }
  }
}

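To do what you are looking for, you will probably have to change the tokenizer you are using. You are currently using the whitespace tokenizer, which has to be replaced with the pattern tokenizer. Your new mapping should therefore look like this: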

PUT products
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "pattern",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "code": {
          "type": "text",
          "analyzer": "custom_analyzer"
        }
      }
    }
  }
}
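After changing the mapping, a query for VMA1615 will return MS-VMA1615-0D.

This is because the pattern tokenizer splits the string "MS-VMA1615-0D" into the tokens "MS", "VMA1615" and "0D". Whenever your query contains any one of these tokens, the document is returned.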

POST _analyze
{
  "tokenizer": "pattern",
  "text": "MS-VMA1615-0D"
}
This returns:

{
  "tokens": [
    {
      "token": "MS",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 0
    },
    {
      "token": "VMA1615",
      "start_offset": 3,
      "end_offset": 10,
      "type": "word",
      "position": 1
    },
    {
      "token": "0D",
      "start_offset": 11,
      "end_offset": 13,
      "type": "word",
      "position": 2
    }
  ]
}
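
The difference between the two tokenizers can be sketched in plain Java without a running cluster. This is only an approximation under stated assumptions: the class and method names are hypothetical, String.split stands in for the tokenizer (whitespace split versus the pattern tokenizer's default \W+ separator), and toLowerCase stands in for the lowercase filter of custom_analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TokenizerSketch {
    // Split on the given separator regex, drop empty pieces, and lowercase
    // each token (a stand-in for the analyzer's "lowercase" filter).
    static List<String> analyze(String text, String separatorRegex) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split(separatorRegex)) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String code = "MS-VMA1615-0D";
        // The query text goes through the same lowercasing as the indexed text.
        String query = "VMA1615".toLowerCase(Locale.ROOT);

        List<String> whitespace = analyze(code, "\\s+"); // whitespace tokenizer
        List<String> pattern = analyze(code, "\\W+");    // pattern tokenizer default

        // prints: [ms-vma1615-0d] contains vma1615: false
        System.out.println(whitespace + " contains " + query + ": " + whitespace.contains(query));
        // prints: [ms, vma1615, 0d] contains vma1615: true
        System.out.println(pattern + " contains " + query + ": " + pattern.contains(query));
    }
}
```

With the whitespace split the whole code is a single term, so only the full code or a wildcard pattern over it can match. With the \W+ split, "vma1615" is a term of its own.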
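Based on your comment:

That is not how Elasticsearch works. Elasticsearch stores terms and their corresponding documents in an inverted index data structure, and by default the terms produced for full-text search come from splitting the text on word boundaries such as whitespace, i.e. the text "Hi there I am a technocrat" would be split into ["Hi", "there", "I", "am", "a", "technocrat"]. This means that the stored terms depend on how the text is tokenized. Later, when you query, say in the example above, if I query for "technocrat" I get this result because the inverted index has that term associated with my document. So in your case, "VMA" is not stored as a term.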
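
The lookup described above can be illustrated with a toy inverted index in Java. This is a minimal sketch, not how Elasticsearch is implemented internally: the class InvertedIndexSketch and its methods are hypothetical, terms are produced by a plain whitespace split, and no lowercasing or other filtering is applied.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Tokenize on whitespace and record each term against the document id.
    void index(int docId, String text) {
        for (String term : text.split("\\s+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // A search is an exact lookup of the term; substrings of a term do not match.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.index(1, "Hi there I am a technocrat");
        index.index(2, "MS-VMA1615-0D");

        System.out.println(index.search("technocrat"));    // prints: [1]
        System.out.println(index.search("MS-VMA1615-0D")); // prints: [2]
        System.out.println(index.search("VMA"));           // prints: []
    }
}
```

The last lookup comes back empty because "VMA" was never stored as a term on its own, which is the situation described above.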

To make "VMA" searchable on its own, use the following mapping:

PUT products
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "type": "custom",
          "tokenizer": "my_pattern_tokenizer",
          "char_filter": [
            "html_strip"
          ],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      },
      "tokenizer": {
        "my_pattern_tokenizer": {
          "type": "pattern",
          "pattern": "-|\\d"
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "code": {
          "type": "text",
          "analyzer": "custom_analyzer"
        }
      }
    }
  }
}
To check it:

POST products/_analyze
{
  "tokenizer": "my_pattern_tokenizer",
  "text": "MS-VMA1615-0D"
}
This produces:

{
  "tokens": [
    {
      "token": "MS",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 0
    },
    {
      "token": "VMA",
      "start_offset": 3,
      "end_offset": 6,
      "type": "word",
      "position": 1
    },
    {
      "token": "D",
      "start_offset": 12,
      "end_offset": 13,
      "type": "word",
      "position": 2
    }
  ]
}
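
The same split can be reproduced with java.util.regex to see where the offsets come from. This is a rough sketch: the class and method names are hypothetical, and it only imitates the tokenizer step, treating every match of "-|\\d" as a separator and dropping empty tokens, without applying the lowercase or asciifolding filters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternSplitSketch {
    // Emit every non-empty run of text between separator matches,
    // formatted as token@startOffset-endOffset.
    static List<String> tokensWithOffsets(String text, String separatorRegex) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(separatorRegex).matcher(text);
        int start = 0;
        while (m.find()) {
            if (m.start() > start) {
                out.add(text.substring(start, m.start()) + "@" + start + "-" + m.start());
            }
            start = m.end(); // continue after the separator
        }
        if (start < text.length()) {
            out.add(text.substring(start) + "@" + start + "-" + text.length());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints: [MS@0-2, VMA@3-6, D@12-13]
        System.out.println(tokensWithOffsets("MS-VMA1615-0D", "-|\\d"));
    }
}
```

Because every digit acts as a separator, the digits themselves are never indexed. Assuming the query text is analyzed with the same analyzer, a search for VMA1615 is reduced to the term for VMA and would match any code that contains VMA, whatever digits follow it.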

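This pattern does not work if I search for VMA. Is it possible to make it searchable with just VMA?

Yes, it is possible if you are not concerned about the digits that follow VMA, i.e. "1616". If you are not concerned about the digits, I can change my answer. Let me know!

Yes, it is like permutations and combinations. I may also provide VMA alone as input, and I expect it to return whatever relates to my input, since it is available in Elasticsearch.

That makes sense. Updated the answer.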