Java SolrException: Could not load collection from ZK
I have a big problem. Our marketplace has 7,280,000 items. We index them into SolrCloud using two instances. Recently, however, the indexing service has been failing frequently. The service is written in Java with Spark. If I start the job manually, most of the time it finishes without any errors.
13-04-2021 22:46:44 CEST re_indexer INFO - Lost task 7424.0 in stage 4.0 (TID 21173, 188.34.196.190, executor 1): org.apache.solr.common.SolrException: Could not load collection from ZK: model_part
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.solr.common.cloud.ZkStateReader.getCollectionLive(ZkStateReader.java:1316)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.solr.common.cloud.ZkStateReader$LazyCollectionRef.get(ZkStateReader.java:732)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.solr.common.cloud.ClusterState$CollectionRef.get(ClusterState.java:386)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.solr.client.solrj.impl.CloudSolrClient.getDocCollection(CloudSolrClient.java:1208)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:851)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:819)
13-04-2021 22:46:44 CEST re_indexer INFO - Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /collections/model_part/state.json
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
13-04-2021 22:46:44 CEST re_indexer INFO - at org.apache.solr.common.cloud.SolrZkClient.lambda$getData$5(SolrZkClient.java:341)
I saw that ZooKeeper has a session timeout, but increasing the timeout does not change anything.
My abstracted Java indexing code:
    JavaRDD<ModelPart> parts = loadParts();
    int repartition = (int) Math.max(parts.count() / 100, 10);
    JavaRDD<SolrInputDocument> solrDocs = parts
            .repartition(repartition)
            .mapPartitions(new SolrConverterAggregatedStage())
            .persist(StorageLevel.MEMORY_AND_DISK_SER());
    SolrClient solrDriverClient = getSolrClient(zkHosts);
    solrDocs.foreachPartition(solrInputDocumentIterator -> {
        SolrClient solrClient = getSolrClient(zkHosts);
        solrClient.add(collection, solrInputDocumentIterator);
    });
    try {
        solrDriverClient.commit(collection, true, false);
        solrDriverClient.close();
    } catch (SolrServerException | IOException e) {
        LOG.warn(e.getMessage(), e);
    }
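Thanks for your help.
This may be caused by GC pauses that stall Solr for a little too long. If possible, it may help to change the heap size, upgrade Solr, or look at a newer VM. For more discussion of this issue, see .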
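Independently of GC tuning, it may also help to keep each update request small, so that neither the indexing job nor Solr has to hold a large request in memory at once. The following is only a minimal sketch of that idea, not the asker's actual code: `BatchedIndexing` and `forEachBatch` are hypothetical names, and the `Consumer` stands in for whatever sends a batch to Solr. It uses only the Java standard library so it can be run on its own.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not part of SolrJ or Spark): drains an iterator in bounded batches.
public class BatchedIndexing {

    /**
     * Reads the iterator in chunks of at most batchSize elements and passes
     * each chunk to sink. Returns the number of batches handed to sink.
     */
    public static <T> int forEachBatch(Iterator<T> docs, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        while (docs.hasNext()) {
            batch.add(docs.next());
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batches++;
                // Start a fresh list so the sink may keep a reference to the old one.
                batch = new ArrayList<>(batchSize);
            }
        }
        // Send the final, possibly smaller, batch.
        if (!batch.isEmpty()) {
            sink.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<List<Integer>> sent = new ArrayList<>();
        int batches = forEachBatch(List.of(1, 2, 3, 4, 5).iterator(), 2, sent::add);
        System.out.println(batches + " batches: " + sent); // prints "3 batches: [[1, 2], [3, 4], [5]]"
    }
}
```

Inside `foreachPartition`, the sink would presumably wrap a call such as `solrClient.add(collection, batch)` and rethrow its checked exceptions as unchecked ones. Since `SolrClient` implements `Closeable`, the per-partition client could also be opened in a try-with-resources block so that its ZooKeeper connection is released when the partition finishes. Assuming `loadParts()` returns all 7,280,000 items, `parts.count() / 100` yields about 72,800 partitions, so clients that are never closed could add up; whether that contributes to the session expiry here is a guess and depends on what `getSolrClient` does.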