Grails searchable index gets locked on manual update (LockObtainFailedException)

Tags: grails, compass-lucene, grails-searchable, searchable-plugin

We have a Grails project running behind a load balancer. There are three instances of the Grails application running on the server (in separate Tomcat instances). Each instance has its own searchable index. Because the indexes are separate, automatic updates are not enough to keep them consistent across the application instances. We have therefore disabled searchable index mirroring, and updates to the index are done manually in a scheduled Quartz job. To our understanding, no other part of the application should be modifying the index.

The Quartz job runs once a minute; it checks the database for rows the application has updated and re-indexes those objects. The job also checks whether the same job is already running, so that it does not do any concurrent indexing. The application runs fine for a few hours after startup, and then suddenly, when the job starts, it throws a LockObtainFailedException:
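The "skip if already running" check can be sketched as a compare-and-set flag. The `ConcurrencyHelper` used in the job below is not shown in the question, so the class and method names here are only an assumed shape, not its actual implementation:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hedged sketch: a non-reentrant "skip if already running" guard.
// The real ConcurrencyHelper in the job is not shown in the question;
// names here are illustrative only.
public class SingleRunGuard {
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Returns true only for the first caller; later callers are told to skip. */
    public boolean tryEnter() {
        return running.compareAndSet(false, true);
    }

    /** Must run in a finally block, or the guard stays locked forever. */
    public void exit() {
        running.set(false);
    }
}
```

Note the failure mode: if `exit()` is ever skipped (for example because an error escapes before the finally block runs), every later run is refused — the same symptom the Lucene write lock shows here.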

22.10.2012 11:20:40 [xxxx.ReindexJob] ERROR Could not update searchable index, class org.compass.core.engine.SearchEngineException: Failed to open writer for sub index [product]; nested exception is org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/home/xxx/tomcat/searchable-index/index/product/lucene-a7bbc72a49512284f5ac54f5d7d32849-write.lock

Judging by the logs from the previous execution of the job, the re-indexing completed without errors and the job finished successfully. Still, this re-index attempt throws the lock exception, as if the previous operation were unfinished and the lock had never been released. The lock is not released until the application is restarted.
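The "Lock obtain timed out" message points at Lucene's `SimpleFSLock`, which is nothing more than a file on disk: the lock is held for exactly as long as the `write.lock` file exists, so a writer that is never closed leaves it behind for every later writer. A minimal sketch of that mechanism (not Compass's actual code; paths and names are illustrative):

```java
import java.io.File;
import java.io.IOException;

// Hedged sketch of SimpleFSLock semantics: the lock IS the file.
// Compass/Lucene manage the real lock internally; this only shows
// why an unreleased lock survives until the file is removed.
public class FileBasedLock {
    private final File lockFile;

    public FileBasedLock(File indexDir, String name) {
        this.lockFile = new File(indexDir, name);
    }

    /** Atomically creates the lock file; returns false if another writer holds it. */
    public boolean obtain() throws IOException {
        return lockFile.createNewFile();
    }

    /** Deletes the file. If this never runs, the index stays locked until
        someone removes the file by hand or the app restarts and cleans up. */
    public void release() {
        lockFile.delete();
    }
}
```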

We tried to work around the problem by manually opening the locked index, which causes the following error to be printed to the log:

22.10.2012 11:21:30 [manager.IndexWritersManager] ERROR Illegal state, marking an index writer as open, while another is marked as open for sub index [product]

After this, the job appears to work correctly and does not get stuck in a locked state again. However, this causes the application to constantly use 100% of the CPU. Below is a shortened version of the Quartz job code.

Any help in solving this problem would be appreciated, thanks in advance.

class ReindexJob {

def compass
...

static Calendar lastIndexed

static triggers = {
    // Every day every minute (at xx:xx:30), start delay 2 min
    // cronExpression:                           "s  m h D M W [Y]"
    cron name: "ReindexTrigger", cronExpression: "30 * * * * ?", startDelay: 120000
}

def execute() {
    if (ConcurrencyHelper.isLocked(ConcurrencyHelper.Locks.LUCENE_INDEX)) {
        log.error("Search index has been locked, not doing anything.")
        return
    }

    try {
        boolean acquiredLock = ConcurrencyHelper.lock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
        if (!acquiredLock) {
            log.warn("Could not lock search index, not doing anything.")
            return
        }

        Calendar reindexDate = lastIndexed
        Calendar newReindexDate = Calendar.instance
        if (!reindexDate) {
            reindexDate = Calendar.instance
            reindexDate.add(Calendar.MINUTE, -3)
            lastIndexed = reindexDate
        }

        log.debug("+++ Starting ReindexJob, last indexed ${TextHelper.formatDate("yyyy-MM-dd HH:mm:ss", reindexDate.time)} +++")
        Long start = System.currentTimeMillis()

        String reindexMessage = ""

        // Retrieve the ids of products that have been modified since the job last ran
        String productQuery = "select p.id from Product ..."
        List<Long> productIds = Product.executeQuery(productQuery, ["lastIndexedDate": reindexDate.time, "lastIndexedCalendar": reindexDate])

        if (productIds) {
            reindexMessage += "Found ${productIds.size()} product(s) to reindex. "

            final int BATCH_SIZE = 10
            Long time = TimeHelper.timer {
                for (int inserted = 0; inserted < productIds.size(); inserted += BATCH_SIZE) {
                    log.debug("Indexing from ${inserted + 1} to ${Math.min(inserted + BATCH_SIZE, productIds.size())}: ${productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size()))}")
                    Product.reindex(productIds.subList(inserted, Math.min(inserted + BATCH_SIZE, productIds.size())))
                    Thread.sleep(250)
                }
            }

            reindexMessage += " (${time / 1000} s). "
        } else {
            reindexMessage += "No products to reindex. "
        }

        log.debug(reindexMessage)

        // Re-index brands
        Brand.reindex()

        lastIndexed = newReindexDate

        log.debug("+++ Finished ReindexJob (${(System.currentTimeMillis() - start) / 1000} s) +++")
    } catch (Exception e) {
        log.error("Could not update searchable index, ${e.class}: ${e.message}")
        if (e instanceof org.apache.lucene.store.LockObtainFailedException || e instanceof org.compass.core.engine.SearchEngineException) {
            log.info("This is a Lucene index locking exception.")
            for (String subIndex in compass.searchEngineIndexManager.getSubIndexes()) {
                if (compass.searchEngineIndexManager.isLocked(subIndex)) {
                    log.info("Releasing Lucene index lock for sub index ${subIndex}")
                    compass.searchEngineIndexManager.releaseLock(subIndex)
                }
            }
        }
    } finally {
        ConcurrencyHelper.unlock(ConcurrencyHelper.Locks.LUCENE_INDEX, "ReindexJob")
    }
}
}
Based on JMX CPU samples, Compass seems to be doing some scheduling behind the scenes. From one-minute CPU samples, there does not seem to be much difference when comparing a normal instance against a 100%-CPU instance:
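For comparing the normal and the spinning instance, the same per-thread CPU numbers a profiler samples can also be read directly over JMX from inside the JVM. A minimal sketch using the standard `java.lang.management` API (nothing Compass-specific):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hedged sketch: list per-thread CPU time via JMX to spot a spinning thread.
public class ThreadCpuDump {
    public static void dump() {
        ThreadMXBean tmx = ManagementFactory.getThreadMXBean();
        if (!tmx.isThreadCpuTimeSupported()) return;
        for (long id : tmx.getAllThreadIds()) {
            ThreadInfo info = tmx.getThreadInfo(id);
            long cpuNanos = tmx.getThreadCpuTime(id); // -1 if the thread has died
            if (info != null && cpuNanos > 0) {
                System.out.printf("%s: %.1f ms CPU%n",
                        info.getThreadName(), cpuNanos / 1e6);
            }
        }
    }
}
```

Taking two dumps a few seconds apart and diffing the numbers shows which thread is burning the CPU on the 100% instance.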

  • org.apach
    compassSettings = [
    'compass.engine.optimizer.schedule.period': '300',
    'compass.engine.mergeFactor':'1000',
    'compass.engine.maxBufferedDocs':'1000',
    'compass.engine.ramBufferSize': '128',
    'compass.engine.useCompoundFile': 'false',
    'compass.transaction.processor': 'read_committed',
    'compass.transaction.processor.read_committed.concurrentOperations': 'false',
    'compass.transaction.lockTimeout': '30',
    'compass.transaction.lockPollInterval': '500',
    'compass.transaction.readCommitted.translog.connection': 'ram://'
    ]
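The two `compass.transaction.lockTimeout` / `lockPollInterval` settings above describe a poll-until-timeout loop: keep retrying the lock at the poll interval until the timeout expires. As a rough sketch of that behaviour (not Compass's implementation; units here are plain milliseconds for illustration):

```java
import java.util.function.BooleanSupplier;

// Hedged sketch of a poll-until-timeout acquire loop, the behaviour that
// lockTimeout / lockPollInterval configure in Compass.
public class PollingLock {
    public static boolean acquire(BooleanSupplier tryLock, long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
            if (tryLock.getAsBoolean()) return true;                   // got the lock
            if (System.currentTimeMillis() >= deadline) return false;  // timed out
            Thread.sleep(pollMs);                                      // wait, then retry
        }
    }
}
```

With a stale `write.lock` file on disk, `tryLock` never succeeds, so every attempt runs the full timeout and then fails — matching the "Lock obtain timed out" message in the log above.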