Apache ZooKeeper: is it possible to avoid nested RetryLoop.callWithRetry calls in order to get a consistent timeout?

Tags: apache-zookeeper, apache-curator

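I have configured a reasonable timeout using BoundedExponentialBackoffRetry, and generally it works as I would expect: if ZK is down when I call "create().forPath", the call fails after the configured retries. But if ZK is unavailable when I call acquire on an InterProcessReadWriteLock, it takes far longer to finally time out.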

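My call to acquire is wrapped in "RetryLoop.callWithRetry", and it goes on to call findProtectedNodeInForeground, which is also wrapped in "RetryLoop.callWithRetry". If I have configured BoundedExponentialBackoffRetry to retry 20 times, the inner loop retries 20 times for each of the 20 iterations of the outer retry loop, so it retries 400 times in total.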
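To make the arithmetic concrete, here is a minimal sketch in plain Java that does not use Curator at all. The class NestedRetryDemo and its callWithRetry method are hypothetical stand-ins, not Curator's implementation, and the sketch counts attempts rather than retries, so the exact figure inside Curator may be off by a small amount. It only shows that wrapping one retry loop inside another multiplies the number of times the failing operation runs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

public class NestedRetryDemo {

    /** Hypothetical stand-in for a retry loop: gives up after maxAttempts failed tries. */
    public static <T> T callWithRetry(int maxAttempts, Callable<T> task) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // a real retry policy would sleep here before trying again
            }
        }
        throw last;
    }

    /** Counts how often the innermost operation runs when two retry loops are nested. */
    public static int countNestedAttempts(int maxAttempts) {
        AtomicInteger calls = new AtomicInteger();
        // Innermost operation: always fails, as if ZK were down.
        Callable<Void> failing = () -> {
            calls.incrementAndGet();
            throw new IllegalStateException("ZK unavailable (simulated)");
        };
        // Inner retry loop, which is itself retried by the outer loop.
        Callable<Void> inner = () -> callWithRetry(maxAttempts, failing);
        try {
            callWithRetry(maxAttempts, inner);
        } catch (Exception expected) {
            // Both loops are exhausted; nothing more to do.
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(countNestedAttempts(20)); // prints 400
    }
}
```

With a backoff sleep between attempts, the total wait grows in the same multiplicative way, which matches the much longer timeout I see on acquire.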

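We really need a consistent timeout, after which we fail. Am I doing something wrong here? If not, I think I will call the troublesome methods in a new thread that I can kill after my own timeout.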

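Here is sample code to reproduce it. I put a breakpoint on the line after each comment, take ZK down, then let it continue and grab a stack trace while it is retrying.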

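import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessReadWriteLock;
import org.apache.curator.retry.BoundedExponentialBackoffRetry;

public class GoCurator {
    public static void main(String[] args) throws Exception {

        // Base sleep 200 ms, max sleep 10000 ms, at most 20 retries
        CuratorFramework cf = CuratorFrameworkFactory.newClient(
                "localhost:2181",
                new BoundedExponentialBackoffRetry(200, 10000, 20)
        );
        cf.start();

        String root = "/myRoot";
        if (cf.checkExists().forPath(root) == null) {
            // Stacktrace A showing what happens if ZK is down for this call
            cf.create().forPath(root);
        }

        InterProcessReadWriteLock lock = new InterProcessReadWriteLock(cf, "/grant/myLock");

        // See stacktrace B showing the nested re-try if ZK is down for this call
        lock.readLock().acquire();

        lock.readLock().release();

        System.out.println("done");
    }
}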

Stacktrace A (ZK is down when create().forPath is called). This shows a single retry loop, so it exits after the correct number of attempts:

  java.lang.Thread.State: WAITING
  at java.lang.Object.wait(Object.java:-1)
  at java.lang.Object.wait(Object.java:502)
  at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1499)
  at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1487)
  at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2617)
  at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:242)
  at org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:231)
  at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64)
  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
  at org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:228)
  at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:219)
  at org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:41)
  at com.gebatech.curator.GoCurator.main(GoCurator.java:25)

Stacktrace B (ZK is down when InterProcessReadWriteLock#readLock#acquire is called). This shows the nested retry loops, so it does not exit until 20*20 attempts have been made:

  java.lang.Thread.State: WAITING
  at sun.misc.Unsafe.park(Unsafe.java:-1)
  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
  at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
  at org.apache.curator.CuratorZookeeperClient.internalBlockUntilConnectedOrTimedOut(CuratorZookeeperClient.java:434)
  at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:56)
  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
  at org.apache.curator.framework.imps.CreateBuilderImpl.findProtectedNodeInForeground(CreateBuilderImpl.java:1239)
  at org.apache.curator.framework.imps.CreateBuilderImpl.access$1700(CreateBuilderImpl.java:51)
  at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1167)
  at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1156)
  at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:64)
  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:100)
  at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1153)
  at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:607)
  at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:597)
  at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:575)
  at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:51)
  at org.apache.curator.framework.recipes.locks.StandardLockInternalsDriver.createsTheLock(StandardLockInternalsDriver.java:54)
  at org.apache.curator.framework.recipes.locks.LockInternals.attemptLock(LockInternals.java:225)
  at org.apache.curator.framework.recipes.locks.InterProcessMutex.internalLock(InterProcessMutex.java:237)
  at org.apache.curator.framework.recipes.locks.InterProcessMutex.acquire(InterProcessMutex.java:89)
  at com.gebatech.curator.GoCurator.main(GoCurator.java:29)

This is a real, long-standing problem with how Curator uses retries. I have a fix and a PR ready. I would like more eyes on it.

I have raised a Jira for this, but it has not been dealt with yet, so I am going to go ahead with a workaround for now.
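For anyone wanting the same kind of workaround, below is a minimal sketch of the idea from the question: run the blocking call on a separate thread and give up after a deadline of your own. It uses only java.util.concurrent. The class HardTimeout and the method callWithDeadline are hypothetical names made up for illustration, not part of Curator. It assumes the blocked call responds to thread interruption; whether a Curator retry loop then exits promptly is something I have not verified.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class HardTimeout {

    /** Runs task on a worker thread; interrupts it and throws TimeoutException after timeoutMs. */
    public static <T> T callWithDeadline(Callable<T> task, long timeoutMs) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "deadline-worker");
            t.setDaemon(true); // do not keep the JVM alive if the task ignores interrupts
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker thread
            throw e;
        } catch (ExecutionException e) {
            // Surface the task's own exception where possible
            Throwable cause = e.getCause();
            throw (cause instanceof Exception) ? (Exception) cause : e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Fast task: completes well within the deadline
        System.out.println(callWithDeadline(() -> "acquired", 1000));

        // Slow task: stands in for a call stuck in nested retry loops
        try {
            callWithDeadline(() -> { Thread.sleep(60_000); return "never"; }, 200);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

One caveat if this is applied to lock.readLock().acquire(): as far as I know, Curator's locks are owned by the thread that acquired them, so the acquire, the protected work and the release would probably all need to run inside the same task rather than only the acquire.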