
ElasticSearch constraint when fetching more than 50k documents using the Java API


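I am querying an Elasticsearch index from the Java API using SearchSourceBuilder. My index has more than 100k documents, and I have increased the index's max result window to 120000. If I try to fetch 120k documents from my Java code, it throws a NullPointerException at the line below: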

SearchHit[] searchHits = searchResponse.getHits().getHits();
If I reduce the SearchSourceBuilder size to 50k, it works fine, but then I can only fetch 50k documents.

Please find my code below:

RestHighLevelClient restHighLevelClient = null;
    Document doc=new Document();

    logger.info("Started Indexing the Document.....");

    try {
        restHighLevelClient = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 9200, "http"),
                new HttpHost("localhost", 9201, "http")));
    } catch (Exception e) {
        System.out.println(e.getMessage());
    }


    //Fetching Id, FilePath & FileName from Document Index. 
    SearchRequest searchRequest = new SearchRequest(INDEX); 
    searchRequest.types(TYPE);
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    QueryBuilder qb = QueryBuilders.matchAllQuery();
    searchSourceBuilder.query(qb);
    searchSourceBuilder.size(120000); 
    searchRequest.source(searchSourceBuilder);
    SearchResponse searchResponse = null;
    try {
         searchResponse = restHighLevelClient.search(searchRequest);
    } catch (IOException e) {
        e.getLocalizedMessage();
    }

    SearchHit[] searchHits = searchResponse.getHits().getHits(); // Getting null pointer exception after processing some documents. Count is not very constant.
    long totalHits=searchResponse.getHits().totalHits;
    logger.info("Total Hits --->"+totalHits);
Please find my index settings below:

{
  "document_attachment": {
    "settings": {
      "index": {
        "number_of_shards": "5",
        "provided_name": "document_attachment",
        "max_result_window": "150000",
        "creation_date": "1531402811016",
        "analysis": {
          "analyzer": {
            "custom_analyzer": {
              "filter": [
                "lowercase",
                "asciifolding"
              ],
              "char_filter": [
                "html_strip"
              ],
              "type": "custom",
              "tokenizer": "whitespace"
            },
            "product_catalog_keywords_analyzer": {
              "filter": [
                "lowercase",
                "asciifolding"
              ],
              "char_filter": [
                "html_strip"
              ],
              "type": "custom",
              "tokenizer": "whitespace"
            }
          }
        },
        "number_of_replicas": "1",
        "uuid": "UBRQAkg-Su-FfeAtBTGFIw",
        "version": {
          "created": "6020399"
        }
      }
    }
  }
}

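You need to use a scroll search rather than trying to fetch everything at once. Scrolling lets you walk through the results one page at a time.

With scroll you can retrieve as many results as you need; there is no upper limit. You will not get ranked results, but ranking is meaningless on a result set this large.

See how to do this.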
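
To make the shape of a scroll loop concrete, here is a minimal sketch in plain Java. It deliberately does not use the Elasticsearch client: `ScrollSketch`, `Page`, `PageSource`, `fetchAll` and `inMemorySource` are all hypothetical names invented for this illustration, and the in-memory source only stands in for a real index. With the 6.x high-level REST client, `firstPage` would roughly correspond to calling `search` on a `SearchRequest` that has a scroll keep-alive set via `searchRequest.scroll(...)`, and `nextPage` to calling `searchScroll` with a `SearchScrollRequest` built from `searchResponse.getScrollId()`; treat that mapping as an assumption to check against the documentation for your client version.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative stand-in for a scroll-style search; NOT the Elasticsearch client. */
public class ScrollSketch {

    /** One batch of results plus the cursor needed to request the next batch. */
    static final class Page {
        final String scrollId;
        final List<String> hits;

        Page(String scrollId, List<String> hits) {
            this.scrollId = scrollId;
            this.hits = hits;
        }
    }

    /** Hypothetical abstraction over "initial search" and "continue scroll". */
    interface PageSource {
        Page firstPage(int batchSize);

        Page nextPage(String scrollId);
    }

    /** Collects every hit by following the cursor until an empty page arrives. */
    static List<String> fetchAll(PageSource source, int batchSize) {
        List<String> all = new ArrayList<>();
        Page page = source.firstPage(batchSize);
        while (page != null && !page.hits.isEmpty()) {
            all.addAll(page.hits);
            page = source.nextPage(page.scrollId);
        }
        return all;
    }

    /** In-memory fake holding `total` documents, used only to exercise the loop. */
    static PageSource inMemorySource(int total) {
        return new PageSource() {
            private int position = 0;
            private int size = 0;

            public Page firstPage(int batchSize) {
                position = 0;
                size = batchSize;
                return slice();
            }

            public Page nextPage(String scrollId) {
                return slice();
            }

            // Returns the next `size` documents, or fewer (possibly none) at the end.
            private Page slice() {
                int end = Math.min(position + size, total);
                List<String> hits = new ArrayList<>();
                for (int i = position; i < end; i++) {
                    hits.add("doc-" + i);
                }
                position = end;
                return new Page("cursor-" + end, hits);
            }
        };
    }

    public static void main(String[] args) {
        // 120000 documents fetched in batches of 5000 instead of one huge request.
        List<String> docs = fetchAll(inMemorySource(120000), 5000);
        System.out.println("Fetched " + docs.size() + " documents");
    }
}
```

The point of the design is that the batch size stays small and constant (5000 here, an arbitrary choice) no matter how many documents the index holds, so no single response has to carry all 120k hits. With a real scroll you would also release the scroll context once the loop finishes.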