Java SOLR using too much memory (part 2)

java spring solr lucene

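This is essentially the same as this question, but that one has no useful answers and our situation is slightly different:

We are running SOLR 5.5.0 on Windows 2008 R2 with JDK version 1.8.0_77-b03. While our indexing process runs, the private working set of the java process running SOLR grows until it uses all 8GB of memory on the box.

We are indexing 3M+ documents with a Spring Batch Boot program that we wrote using the SOLRJ client. This is the code that indexes the documents we have collected: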

    log.info("Adding " + docList.size() + " documents to Solr index");
    if(docList.size() == 0) {
        log.warn("Was asked to index 0 records, but input size was " + items.size());
    } else {
        log.debug("Splitting list of size " + docList.size() + " into manageable chunks of " + batchCommitSize);
        List<List<SolrInputDocument>> partitionedList = Lists.partition(docList, batchCommitSize);

        SolrClient solrClient = (SolrClient) applicationContext.getBean("solrClient");

        for (List<SolrInputDocument> chewableChunk : partitionedList) {
            solrClient.add(chewableChunk);
            solrClient.commit();
            log.info(chewableChunk.size() + " documents committed.");
        }

        log.info("Finished batch indexing of " + docList.size() + " documents.");
    }

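This is our schema definition. It is really long, so I have only cut and pasted the part with the field definitions. I can upload more if needed. Most of it was copied from the example configuration in the tutorial.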

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="_root_" type="string" indexed="true" stored="false"/>
<field name="_text_" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="*" dest="_text_"/>

<field name="fileName" type="string" indexed="true" stored="true" required="true"/>
<field name="projectName" type="string" indexed="true" stored="true" required="true"/>
<field name="lastCommitAuthor" type="string" indexed="true" stored="true"/>
<field name="vcsUrl" type="string" indexed="true" stored="true"/>
<field name="teamCityUrl" type="string" indexed="true" stored="true"/>
<field name="jenkinsUrl" type="string" indexed="true" stored="true"/>
<field name="content" type="text_general" indexed="true" stored="true" required="true"/>
<field name="relativePath" type="string" indexed="true" stored="true" required="true"/>

<!-- Field to use to determine and enforce document uniqueness.
  Unless this field is marked with required="false", it will be a required field
-->
<uniqueKey>id</uniqueKey>


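The earlier question suggested that memory-mapped files might be the culprit, but we could not find a way to turn them off. We also tried closing and recreating the client on every commit.

Is there any way to reduce the amount of memory SOLR uses while indexing?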

I know how to turn off the mmap cache. Search for directoryFactory in solrconfig.xml and replace the existing tag with the one given below.

This will turn off MMAP files:

<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.SimpleFSDirectoryFactory}"/>

Because of this change, you will not be able to do near-real-time search.
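
To make the difference between the two directory types more concrete, here is a minimal plain-Java sketch. It is not Solr or Lucene code; it only contrasts a memory-mapped read (FileChannel.map, the mechanism Lucene's MMapDirectory is built on) with ordinary reads into a small heap buffer. The class name MmapVsPlainRead and the temp file are made up for the illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Solr or Lucene.
public class MmapVsPlainRead {

    // Sums all bytes through a memory mapping: the file's pages are
    // managed by the operating system, outside the Java heap.
    static long sumMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long sum = 0;
            while (buf.hasRemaining()) {
                sum += buf.get() & 0xFF;
            }
            return sum;
        }
    }

    // Sums all bytes through ordinary reads into a fixed 8 KB heap buffer.
    static long sumPlain(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(8192);
            long sum = 0;
            while (ch.read(buf) != -1) {
                buf.flip();
                while (buf.hasRemaining()) {
                    sum += buf.get() & 0xFF;
                }
                buf.clear();
            }
            return sum;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-demo", ".bin");
        // On Windows a mapping can keep the file locked until the buffer is
        // garbage collected, so cleanup is left to JVM exit (best effort).
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {1, 2, 3, 4});
        System.out.println("mapped sum = " + sumMapped(tmp));
        System.out.println("plain sum  = " + sumPlain(tmp));
    }
}
```

Both methods return the same result; what differs is where the file data lives while it is being read, which is why a heap limit alone does not bound memory used by mapped files.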

How much memory have you allocated to the Solr process?

4GB. The memory options are:

-XX:+UseG1GC ^ -XX:SurvivorRatio=4 ^ -XX:+UseStringDeduplication -XX:+AggressiveHeap
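
When comparing the configured heap with what the process really uses, it can help to print what the JVM itself reports. The following is a small sketch using only the standard java.lang.management API; the class name JvmSettingsCheck is made up. It reports on the JVM it runs in, so it is only meaningful if started with the same options as the Solr process, which is an assumption about how you would use it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.List;

// Hypothetical helper class, not part of Solr.
public class JvmSettingsCheck {

    // Arguments this JVM was actually started with (e.g. -XX:+UseG1GC).
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Upper bound of the Java heap in megabytes, as seen by this JVM.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        System.out.println("JVM arguments: " + jvmArguments());
        System.out.println("Max heap (MB): " + maxHeapMb());
        // Heap and non-heap usage are tracked by the JVM; memory-mapped
        // files are not included in either figure.
        System.out.println("Heap used (MB): "
                + mem.getHeapMemoryUsage().getUsed() / (1024 * 1024));
        System.out.println("Non-heap used (MB): "
                + mem.getNonHeapMemoryUsage().getUsed() / (1024 * 1024));
    }
}
```

If the reported heap usage stays well below the working set shown by Windows, the extra memory is coming from somewhere outside the Java heap.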