Apache ZooKeeper cluster loses connectivity after leader election

I run a ZooKeeper cluster with five nodes, each with the configuration shown below (plus the correct quorum information at the end).

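The cluster initially works fine. For testing purposes, I then killed the ZooKeeper instance that was the leader at the time (every ZooKeeper server runs in the foreground of a screen session). As expected, this triggers a leader election, and another node is elected leader.

However, when I try to send get / set / create requests to the cluster from a Java client (a process that originally ran fine), or when I interface with the cluster via zkCli.sh, the client's state is only ever shown as CONNECTING.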

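At this point, I have performed the obvious troubleshooting steps, such as:

  • echo stat | nc localhost 2181
    on any of the servers still running; this only indicates that everything is fine (sample output below)
  • echo ruok | nc localhost 2181
    which outputs only
    imok
  • The log file (excerpt) of the server that was elected leader after I injected the fault is included below.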
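I would greatly appreciate any help explaining why the cluster no longer seems to be interconnected after the leader election (or at least no longer serves requests), or any suggestions on how to debug this.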
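
The four-letter-word checks listed above can also be run against every server in the ensemble in one pass, which makes it easier to compare what each node reports. The following is a minimal Python sketch, not part of the original setup: the function names are hypothetical, and it assumes that the commands are whitelisted (as with 4lw.commands.whitelist=* in the configuration) and that the server closes the connection after replying.

```python
import socket


def four_letter_word(host, port, command, timeout=3.0):
    """Send a ZooKeeper four-letter-word command and return the raw reply.

    Equivalent in spirit to `echo <command> | nc <host> <port>`.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def probe(servers, command="srvr"):
    """Run one command against each (host, port) pair and collect the replies.

    Unreachable servers are reported instead of raising, so a single dead
    node does not hide the state of the others.
    """
    results = {}
    for host, port in servers:
        try:
            results[(host, port)] = four_letter_word(host, port, command)
        except OSError as exc:  # refused connections, timeouts, etc.
            results[(host, port)] = f"unreachable: {exc}"
    return results
```

Calling probe with the list of ensemble members (host names and ports are whatever the quorum section of the configuration uses) and the srvr or stat command returns one reply per server, including the Mode line, so it is visible at a glance whether exactly one node reports leader and the rest report follower.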

Configuration of each node:

tickTime=2000
initLimit=5
syncLimit=2
dataDir=/usr/local/zookeeper/data
clientPort=2181
minSessionTimeout=4000
maxSessionTimeout=40000
4lw.commands.whitelist=*
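
For reference, initLimit and syncLimit are expressed in ticks, so the windows they define follow from tickTime. The sketch below (helper names are hypothetical) derives them in milliseconds from the configuration above. The 10000 ms init window appears consistent with the "remaining init limit=9798" message in the log, and the configured session timeout bounds match ZooKeeper's defaults of 2 and 20 ticks.

```python
CONFIG = """\
tickTime=2000
initLimit=5
syncLimit=2
dataDir=/usr/local/zookeeper/data
clientPort=2181
minSessionTimeout=4000
maxSessionTimeout=40000
4lw.commands.whitelist=*
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def derived_timeouts_ms(props):
    """Convert tick-based limits into milliseconds."""
    tick = int(props["tickTime"])
    return {
        # time a follower has to connect to and sync with the leader
        "init_window_ms": tick * int(props["initLimit"]),
        # how far a follower may lag behind the leader before being dropped
        "sync_window_ms": tick * int(props["syncLimit"]),
        # defaults used when min/maxSessionTimeout are not set explicitly
        "default_min_session_ms": 2 * tick,
        "default_max_session_ms": 20 * tick,
    }


timeouts = derived_timeouts_ms(parse_properties(CONFIG))
print(timeouts)
```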

Output of echo stat | nc localhost 2181 on the server reporting itself as leader:

Clients:
 /10.0.0.1:35264[1](queued=0,recved=2,sent=1)
 /10.0.0.2:49230[0](queued=0,recved=1,sent=0)
 /127.0.0.1:34162[0](queued=0,recved=1,sent=0)
 /10.0.0.3:49530[0](queued=0,recved=1,sent=0)
 /10.0.0.1:35250[0](queued=0,recved=1,sent=0)
 /10.0.0.4:35406[1](queued=0,recved=2,sent=1)
 /10.0.0.2:49304[1](queued=0,recved=1,sent=1)

 Latency min/avg/max: 0/0/0
 Received: 484
 Sent: 140
 Connections: 7
 Outstanding: 343 
 Zxid: 0x200000000
 Mode: leader
 Node count: 15
 Proposal sizes last/min/max: 32/32/75
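
Two fields in this output are worth reading closely: Outstanding and Zxid. A zxid is a 64-bit number whose upper 32 bits are the leader epoch and whose lower 32 bits are a per-epoch transaction counter. The sketch below (function names are hypothetical) parses the key/value lines of the stat output and splits a zxid into those two parts. Under that reading, 0x200000000 is epoch 2 with counter 0, which may suggest that the new leader had not committed any transaction in its epoch at the time of the query, while 343 requests were outstanding.

```python
STAT_SAMPLE = """\
Clients:
 /10.0.0.1:35264[1](queued=0,recved=2,sent=1)

 Latency min/avg/max: 0/0/0
 Received: 484
 Sent: 140
 Connections: 7
 Outstanding: 343
 Zxid: 0x200000000
 Mode: leader
 Node count: 15
"""


def parse_stat(text):
    """Collect the 'Key: value' lines of a stat reply into a dict.

    Client connection lines contain ':' but not ': ', so they are skipped.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:
            fields[key] = value.strip()
    return fields


def split_zxid(zxid):
    """Split a hexadecimal zxid string into (epoch, counter)."""
    value = int(zxid, 16)
    return value >> 32, value & 0xFFFFFFFF


stat = parse_stat(STAT_SAMPLE)
print(stat["Mode"], stat["Outstanding"], split_zxid(stat["Zxid"]))
```

Applied to the zxid proposed by the follower in the log, split_zxid("0x100000024") gives epoch 1 and counter 36, so that server's last known transaction belongs to the previous epoch.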

Log excerpt (server with myid 4):

2019-08-12 12:25:36,799 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Follower@69] - FOLLOWING - LEADER ELECTION TOOK - 23 MS
2019-08-12 12:25:37,004 [myid:4] - WARN  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Learner@282] - Unexpected exception, tries=0, remaining init limit=9798, connecting to /10.0.0.10:2888
java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:233)
    at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:262)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:77)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1271)
2019-08-12 12:25:39,148 [myid:4] - WARN  [NIOWorkerThread-4:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-08-12 12:25:39,174 [myid:4] - INFO  [NIOWorkerThread-5:FourLetterCommands@234] - The list of known four letter word commands is : [{1936881266=srvr, 1937006964=stat, 2003003491=wchc, 1685417328=dump, 1668445044=crst, 1936880500=srst, 1701738089=envi, 1668247142=conf, -720899=telnet close, 2003003507=wchs, 2003003504=wchp, 1684632179=dirs, 1668247155=cons, 1835955314=mntr, 1769173615=isro, 1920298859=ruok, 1735683435=gtmk, 1937010027=stmk}]
2019-08-12 12:25:39,175 [myid:4] - INFO  [NIOWorkerThread-5:FourLetterCommands@235] - The list of enabled four letter word commands is : [[wchs, stat, wchp, dirs, stmk, conf, ruok, mntr, srvr, wchc, envi, srst, isro, dump, gtmk, telnet close, crst, cons]]
2019-08-12 12:25:39,175 [myid:4] - INFO  [NIOWorkerThread-5:NIOServerCnxn@518] - Processing stat command from /127.0.0.1:59614
2019-08-12 12:25:39,858 [myid:4] - WARN  [NIOWorkerThread-6:NIOServerCnxn@370] - Exception causing close of session 0x0: ZooKeeperServer not running
2019-08-12 12:25:39,906 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Learner@391] - Getting a diff from the leader 0x0
2019-08-12 12:25:39,913 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Learner@546] - Learner received NEWLEADER message
2019-08-12 12:25:40,253 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Learner@529] - Learner received UPTODATE message
2019-08-12 12:25:40,266 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):CommitProcessor@256] - Configuring CommitProcessor with 16 worker threads.
2019-08-12 12:25:40,536 [myid:4] - WARN  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Follower@125] - Got zxid 0x100000001 expected 0x1
2019-08-12 12:25:40,536 [myid:4] - INFO  [SyncThread:4:FileTxnLog@216] - Creating new log file: log.100000001
2019-08-12 12:25:43,617 [myid:4] - INFO  [NIOWorkerThread-7:NIOServerCnxn@518] - Processing stat command from /127.0.0.1:59638
2019-08-12 12:25:43,619 [myid:4] - INFO  [NIOWorkerThread-7:StatCommand@53] - Stat command output
2019-08-12 12:26:06,907 [myid:4] - WARN  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Follower@96] - Exception when following the leader
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
    at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:85)
    at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
    at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:158)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:92)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1271)
2019-08-12 12:26:06,908 [myid:4] - WARN  [RecvWorker:5:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 5, my id = 4, error = 
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-08-12 12:26:06,909 [myid:4] - WARN  [RecvWorker:5:QuorumCnxManager$RecvWorker@1179] - Interrupting SendWorker
2019-08-12 12:26:06,909 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Follower@201] - shutdown called
java.lang.Exception: shutdown Follower
    at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:201)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1275)
2019-08-12 12:26:06,910 [myid:4] - WARN  [SendWorker:5:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-08-12 12:26:06,910 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):LearnerZooKeeperServer@165] - Shutting down
2019-08-12 12:26:06,911 [myid:4] - WARN  [SendWorker:5:QuorumCnxManager$SendWorker@1102] - Send worker leaving thread  id 5 my id = 4
2019-08-12 12:26:06,911 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):ZooKeeperServer@558] - shutting down
2019-08-12 12:26:06,912 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):FollowerRequestProcessor@139] - Shutting down
2019-08-12 12:26:06,912 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):CommitProcessor@362] - Shutting down
2019-08-12 12:26:06,912 [myid:4] - INFO  [FollowerRequestProcessor:4:FollowerRequestProcessor@110] - FollowerRequestProcessor exited loop!
2019-08-12 12:26:06,914 [myid:4] - INFO  [CommitProcessor:4:CommitProcessor@195] - CommitProcessor exited loop!
2019-08-12 12:26:06,917 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):FinalRequestProcessor@514] - shutdown of request processor complete
2019-08-12 12:26:06,922 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):SyncRequestProcessor@191] - Shutting down
2019-08-12 12:26:06,923 [myid:4] - INFO  [SyncThread:4:SyncRequestProcessor@169] - SyncRequestProcessor exited!
2019-08-12 12:26:06,923 [myid:4] - WARN  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1318] - PeerState set to LOOKING
2019-08-12 12:26:06,924 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1193] - LOOKING
2019-08-12 12:26:06,925 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@885] - New election. My id =  4, proposed zxid=0x100000024
2019-08-12 12:26:07,129 [myid:4] - WARN  [WorkerSender[myid=4]:QuorumCnxManager@677] - Cannot open channel to 5 at election address /10.0.0.10:3888
java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:648)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:705)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:618)
    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:477)
    at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:456)
    at java.lang.Thread.run(Thread.java:748)
2019-08-12 12:26:07,337 [myid:4] - INFO  [WorkerSender[myid=4]:QuorumCnxManager@430] - Have smaller server identifier, so dropping the connection: (5, 4)
2019-08-12 12:26:07,337 [myid:4] - INFO  [WorkerReceiver[myid=4]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x100000024 (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-08-12 12:26:07,338 [myid:4] - INFO  [WorkerReceiver[myid=4]:FastLeaderElection@679] - Notification: 2 (message format version), 4 (n.leader), 0x100000024 (n.zxid), 0x2 (n.round), LOOKING (n.state), 4 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-08-12 12:26:07,338 [myid:4] - INFO  [WorkerReceiver[myid=4]:FastLeaderElection@679] - Notification: 2 (message format version), 1 (n.leader), 0x100000024 (n.zxid), 0x2 (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-08-12 12:26:07,338 [myid:4] - INFO  [WorkerReceiver[myid=4]:FastLeaderElection@679] - Notification: 2 (message format version), 4 (n.leader), 0x100000024 (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2019-08-12 12:26:07,342 [myid:4] - INFO  [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /172.17.0.6:46520
2019-08-12 12:26:07,345 [myid:4] - INFO  [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /172.17.0.6:46522
2019-08-12 12:26:07,345 [myid:4] - WARN  [SendWorker:4:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
    at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:418)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1243)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$700(QuorumCnxManager.java:78)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1080)
2019-08-12 12:44:00,079 [myid:4] - WARN  [RecvWorker:2:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 2, my id = 4, error = 
java.net.SocketException: Socket closed
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
    at java.net.SocketInputStream.read(SocketInputStream.java:171)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)

2019-08-12 12:44:43,382 [myid:4] - INFO  [SessionTracker:QuorumZooKeeperServer@157] - Submitting global closeSession request for session 0x4003ead876c0025
2019-08-12 12:44:45,381 [myid:4] - INFO  [SessionTracker:ZooKeeperServer@398] - Expiring session 0x1003ead86e90025, timeout of 40000ms exceeded
2019-08-12 12:44:45,381 [myid:4] - INFO  [SessionTracker:QuorumZooKeeperServer@157] - Submitting global closeSession request for session 0x1003ead86e90025
2019-08-12 12:45:00,485 [myid:4] - INFO  [/0.0.0.0:3888:QuorumCnxManager$Listener@888] - Received connection request /10.0.0.7:42596
2019-08-12 12:45:00,485 [myid:4] - WARN  [RecvWorker:4:QuorumCnxManager$RecvWorker@1176] - Connection broken for id 4, my id = 4, error = 
java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:1161)
2019-08-12 12:45:00,485 [myid:4] - WARN  [SendWorker:2:QuorumCnxManager$SendWorker@1092] - Interrupted while waiting for message on queue
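
When comparing logs like this one across all five servers, it can help to reduce each file to its timestamped entries and count the warnings per source class. The following Python sketch is one way to do that; the regular expression is an assumption based only on the line layout visible in this excerpt, and the function names are hypothetical. Stack-trace lines do not match the pattern and are ignored.

```python
import re
from collections import Counter

# Layout assumed from the excerpt:
# <timestamp> [myid:N] - LEVEL  [<thread>:<Class>@<line>] - <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[myid:(?P<myid>\d+)\] - (?P<level>\w+)\s+"
    r"\[(?P<thread>.+):(?P<cls>[\w$]+)@(?P<line>\d+)\] - (?P<msg>.*)$"
)

LOG_SAMPLE = """\
2019-08-12 12:25:36,799 [myid:4] - INFO  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Follower@69] - FOLLOWING - LEADER ELECTION TOOK - 23 MS
2019-08-12 12:25:37,004 [myid:4] - WARN  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):Learner@282] - Unexpected exception, tries=0, remaining init limit=9798, connecting to /10.0.0.10:2888
java.net.ConnectException: Connection refused (Connection refused)
2019-08-12 12:26:06,923 [myid:4] - WARN  [QuorumPeer[myid=4](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1318] - PeerState set to LOOKING
2019-08-12 12:26:07,129 [myid:4] - WARN  [WorkerSender[myid=4]:QuorumCnxManager@677] - Cannot open channel to 5 at election address /10.0.0.10:3888
"""


def parse_log(text):
    """Return one dict per log entry; non-matching lines are skipped."""
    entries = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw)
        if match:
            entries.append(match.groupdict())
    return entries


def warnings_by_class(entries):
    """Count WARN entries per logging class."""
    return Counter(e["cls"] for e in entries if e["level"] == "WARN")


entries = parse_log(LOG_SAMPLE)
for cls, count in warnings_by_class(entries).most_common():
    print(cls, count)
```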