Java: Problem creating an index with a custom analyzer using Jest
Tags: java, elasticsearch, elasticsearch-jest
Jest provides a brilliant asynchronous API for elasticsearch, and we find it very useful. However, sometimes the resulting requests turn out to be slightly different from what we would expect. Usually we don't care, since everything works fine, but in this case it does not. I want to create an index with a custom ngram analyzer. When I do this following the elasticsearch REST API docs, I call the following:
curl -XPUT 'localhost:9200/test' --data '
{
  "settings": {
    "number_of_shards": 3,
    "analysis": {
      "filter": {
        "keyword_search": {
          "type": "edge_ngram",
          "min_gram": 3,
          "max_gram": 15
        }
      },
      "analyzer": {
        "keyword": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [
            "lowercase",
            "keyword_search"
          ]
        }
      }
    }
  }
}'
Then I confirm that the analyzer is configured correctly using:
curl -XGET 'localhost:9200/test/_analyze?analyzer=keyword&text=Expecting+many+tokens'
In response I receive multiple tokens, such as exp, expe, expec and so on.
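To make the expected response concrete, here is a minimal sketch in plain Java of what a whitespace tokenizer followed by lowercase and edge_ngram (min_gram 3, max_gram 15) filters would produce for this input. It is an illustration only, not Lucene's or Elasticsearch's implementation, and the names EdgeNgramSketch and edgeNgrams are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EdgeNgramSketch {

    // For each whitespace-separated token, lowercases it and emits every
    // prefix whose length lies between minGram and maxGram (inclusive).
    // Tokens shorter than minGram produce nothing in this sketch.
    static List<String> edgeNgrams(String text, int minGram, int maxGram) {
        List<String> tokens = new ArrayList<>();
        for (String word : text.trim().split("\\s+")) {
            String lower = word.toLowerCase(Locale.ROOT);
            int longest = Math.min(maxGram, lower.length());
            for (int len = minGram; len <= longest; len++) {
                tokens.add(lower.substring(0, len));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same text as in the _analyze call above.
        System.out.println(edgeNgrams("Expecting many tokens", 3, 15));
    }
}
```

For "Expecting many tokens" the sketch yields 13 tokens, starting with exp, expe, expec, which matches the shape of the response described above.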
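Now, using the Jest client, I put the configuration JSON into a file on the classpath; its content is exactly the same as the body of the PUT request above. I execute a Jest action constructed like this: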
new CreateIndex.Builder(name)
    .settings(
        ImmutableSettings.builder()
            .loadFromClasspath("settings.json")
            .build()
            .getAsMap()
    ).build();
Result
- Primo: checked with tcpdump, what is actually posted to elasticsearch is (pretty printed):
- Secundo: the resulting index settings are:
{
  "test": {
    "settings": {
      "index": {
        "settings": {
          "analysis": {
            "filter": {
              "keyword_search": {
                "type": "edge_ngram",
                "min_gram": "3",
                "max_gram": "15"
              }
            },
            "analyzer": {
              "keyword": {
                "filter": [
                  "lowercase",
                  "keyword_search"
                ],
                "type": "custom",
                "tokenizer": "whitespace"
              }
            }
          },
          "number_of_shards": "3"     <-- the only difference from the one created with rest call
        },
        "number_of_shards": "3",
        "number_of_replicas": "0",
        "version": {"created": "1030499"},
        "uuid": "Glqf6FMuTWG5EH2jarVRWA"
      }
    }
  }
}
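Glad you found Jest useful; please see my answer below.
Question 1. What is the reason Jest doesn't post my original settings JSON, but some processed JSON instead?
It is not Jest doing that but Elasticsearch's ImmutableSettings, see: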
Map<String, String> test = ImmutableSettings.builder()
    .loadFromSource("{\n" +
        "  \"settings\": {\n" +
        "    \"number_of_shards\": 3,\n" +
        "    \"analysis\": {\n" +
        "      \"filter\": {\n" +
        "        \"keyword_search\": {\n" +
        "          \"type\": \"edge_ngram\",\n" +
        "          \"min_gram\": 3,\n" +
        "          \"max_gram\": 15\n" +
        "        }\n" +
        "      },\n" +
        "      \"analyzer\": {\n" +
        "        \"keyword\": {\n" +
        "          \"type\": \"custom\",\n" +
        "          \"tokenizer\": \"whitespace\",\n" +
        "          \"filter\": [\n" +
        "            \"lowercase\",\n" +
        "            \"keyword_search\"\n" +
        "          ]\n" +
        "        }\n" +
        "      }\n" +
        "    }\n" +
        "  }\n" +
        "}").build().getAsMap();
System.out.println("test = " + test);

Outputs:

test = {
  settings.analysis.filter.keyword_search.type=edge_ngram,
  settings.number_of_shards=3,
  settings.analysis.analyzer.keyword.filter.0=lowercase,
  settings.analysis.analyzer.keyword.filter.1=keyword_search,
  settings.analysis.analyzer.keyword.type=custom,
  settings.analysis.analyzer.keyword.tokenizer=whitespace,
  settings.analysis.filter.keyword_search.max_gram=15,
  settings.analysis.filter.keyword_search.min_gram=3
}

Question 2. Why are the settings generated by Jest not working?
Because your usage of the settings JSON/map is not the intended case. I created this test to reproduce your situation (it is a bit long, but bear with me):

@Test
public void createIndexTemp() throws IOException {
    String index = "so_q_26949195";
    String settingsAsString = "{\n" +
        "  \"settings\": {\n" +
        "    \"number_of_shards\": 3,\n" +
        "    \"analysis\": {\n" +
        "      \"filter\": {\n" +
        "        \"keyword_search\": {\n" +
        "          \"type\": \"edge_ngram\",\n" +
        "          \"min_gram\": 3,\n" +
        "          \"max_gram\": 15\n" +
        "        }\n" +
        "      },\n" +
        "      \"analyzer\": {\n" +
        "        \"keyword\": {\n" +
        "          \"type\": \"custom\",\n" +
        "          \"tokenizer\": \"whitespace\",\n" +
        "          \"filter\": [\n" +
        "            \"lowercase\",\n" +
        "            \"keyword_search\"\n" +
        "          ]\n" +
        "        }\n" +
        "      }\n" +
        "    }\n" +
        "  }\n" +
        "}";
    // The same JSON, flattened into dotted keys by ImmutableSettings.
    Map<String, String> settingsAsMap = ImmutableSettings.builder()
        .loadFromSource(settingsAsString).build().getAsMap();

    // Case 1: create the index from the raw JSON string.
    CreateIndex createIndex = new CreateIndex.Builder(index)
        .settings(settingsAsString)
        .build();
    JestResult result = client.execute(createIndex);
    assertTrue(result.getErrorMessage(), result.isSucceeded());

    GetSettings getSettings = new GetSettings.Builder().addIndex(index).build();
    result = client.execute(getSettings);
    assertTrue(result.getErrorMessage(), result.isSucceeded());
    System.out.println("SETTINGS SENT AS STRING settingsResponse = " + result.getJsonString());

    // The custom "keyword" analyzer of the index should emit several edge n-grams.
    Analyze analyze = new Analyze.Builder()
        .index(index)
        .analyzer("keyword")
        .source("Expecting many tokens")
        .build();
    result = client.execute(analyze);
    assertTrue(result.getErrorMessage(), result.isSucceeded());
    Integer actualTokens = result.getJsonObject().getAsJsonArray("tokens").size();
    assertTrue("Expected multiple tokens but got " + actualTokens, actualTokens > 1);

    // Without an index, "keyword" resolves to the built-in analyzer: one token.
    analyze = new Analyze.Builder()
        .analyzer("keyword")
        .source("Expecting single token")
        .build();
    result = client.execute(analyze);
    assertTrue(result.getErrorMessage(), result.isSucceeded());
    actualTokens = result.getJsonObject().getAsJsonArray("tokens").size();
    assertTrue("Expected single token but got " + actualTokens, actualTokens == 1);

    admin().indices().delete(new DeleteIndexRequest(index)).actionGet();

    // Case 2: create the index from the flattened map.
    createIndex = new CreateIndex.Builder(index)
        .settings(settingsAsMap)
        .build();
    result = client.execute(createIndex);
    assertTrue(result.getErrorMessage(), result.isSucceeded());

    getSettings = new GetSettings.Builder().addIndex(index).build();
    result = client.execute(getSettings);
    assertTrue(result.getErrorMessage(), result.isSucceeded());
    System.out.println("SETTINGS AS MAP settingsResponse = " + result.getJsonString());

    analyze = new Analyze.Builder()
        .index(index)
        .analyzer("keyword")
        .source("Expecting many tokens")
        .build();
    result = client.execute(analyze);
    assertTrue(result.getErrorMessage(), result.isSucceeded());
    actualTokens = result.getJsonObject().getAsJsonArray("tokens").size();
    assertTrue("Expected multiple tokens but got " + actualTokens, actualTokens > 1);
}

When you run it, you will see that in the case where settingsAsMap is used the actual settings are completely wrong (settings includes another settings, which is your JSON, but they should have been merged), and so the analysis fails.

Why is this not the intended usage?
Simply because that is how Elasticsearch behaves in this situation. If the settings data is flattened (as is done by default by the ImmutableSettings class), then it should not have the top-level element settings; if the data is not flattened, it can have that top-level element (and that is why the test case with settingsAsString works).

tl;dr: Your settings JSON should not include the top-level "settings" element (if you run it through ImmutableSettings).

Thank you for the effort you put into answering my question, it must have taken some time! I applied your suggestion and removed the top-level settings element, and it works great.

No problem! Keep in mind that you can simply use the raw string as the settings source.
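The tl;dr can be illustrated without Elasticsearch on the classpath. The following is a rough sketch in plain Java that mimics the dotted-key flattening seen in the output above and shows how unwrapping a top-level "settings" element changes the resulting keys. It is not the ImmutableSettings implementation; SettingsFlattenSketch, flatten, and unwrapSettings are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SettingsFlattenSketch {

    // Flattens nested maps and lists into dotted keys,
    // e.g. {"a": {"b": 1}} becomes {"a.b": "1"}.
    static Map<String, String> flatten(Map<String, Object> nested) {
        Map<String, String> flat = new LinkedHashMap<>();
        flattenInto("", nested, flat);
        return flat;
    }

    private static void flattenInto(String prefix, Object value, Map<String, String> out) {
        if (value instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                flattenInto(join(prefix, String.valueOf(e.getKey())), e.getValue(), out);
            }
        } else if (value instanceof List) {
            List<?> items = (List<?>) value;
            for (int i = 0; i < items.size(); i++) {
                flattenInto(join(prefix, String.valueOf(i)), items.get(i), out);
            }
        } else {
            out.put(prefix, String.valueOf(value));
        }
    }

    private static String join(String prefix, String key) {
        return prefix.isEmpty() ? key : prefix + "." + key;
    }

    // Returns the content of a top-level "settings" element when it is the
    // only key of the source map; otherwise returns the source unchanged.
    @SuppressWarnings("unchecked")
    static Map<String, Object> unwrapSettings(Map<String, Object> source) {
        Object inner = source.get("settings");
        if (source.size() == 1 && inner instanceof Map) {
            return (Map<String, Object>) inner;
        }
        return source;
    }

    public static void main(String[] args) {
        Map<String, Object> keyword = new LinkedHashMap<>();
        keyword.put("filter", Arrays.asList("lowercase", "keyword_search"));
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("number_of_shards", 3);
        inner.put("keyword", keyword);
        Map<String, Object> wrapped = new LinkedHashMap<>();
        wrapped.put("settings", inner);

        // Keys carry the unwanted "settings." prefix.
        System.out.println(flatten(wrapped));
        // Keys start at number_of_shards / keyword.filter.0, as intended.
        System.out.println(flatten(unwrapSettings(wrapped)));
    }
}
```

In practice the simpler route, as noted in the comment above, is to pass the raw JSON string as the settings source, or to remove the top-level "settings" element from the file before loading it through ImmutableSettings.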