Why does an ActiveMQ cluster fail with "server null" when the ZooKeeper master node goes offline?

activemq apache-zookeeper

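I have run into a problem with ActiveMQ: when the master ZooKeeper node goes offline, the entire cluster fails.

We have a 3-node ActiveMQ cluster set up in our development environment. Each node runs ActiveMQ 5.12.0 and ZooKeeper 3.4.6. (Note: we have done some testing with ZooKeeper 3.4.7, but that did not resolve the problem. Time constraints have so far kept us from testing ActiveMQ 5.13.)

We have found that when we stop the master ZooKeeper process (via the "End process tree" command in Task Manager), the remaining two ZooKeeper nodes continue to function normally. Sometimes the ActiveMQ cluster copes with this, and sometimes it does not.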
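To confirm which ZooKeeper node is actually the leader before and after killing the process, something like the following could be used. It is only a sketch: it assumes the `srvr` four-letter command is answered on client port 2181 (it should be in the 3.4.6 release used here, though newer releases may require whitelisting it), and the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ZooKeeper or ActiveMQ.
public class ZkModeCheck {

    /** Returns the value of the "Mode:" line in an srvr reply, or "unknown" if absent. */
    public static String parseMode(String srvrResponse) {
        for (String line : srvrResponse.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Mode:")) {
                return trimmed.substring("Mode:".length()).trim();
            }
        }
        return "unknown";
    }

    /** Sends the four-letter command "srvr" and reads the reply until the server closes the socket. */
    public static String srvr(String host, int port, int timeoutMs) throws Exception {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            OutputStream out = socket.getOutputStream();
            out.write("srvr".getBytes(StandardCharsets.US_ASCII));
            out.flush();
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toString("UTF-8");
        }
    }

    public static void main(String[] args) {
        // Addresses taken from the zkAddress setting in this question.
        String[] hosts = {"192.168.0.10", "192.168.0.11", "192.168.0.12"};
        for (String host : hosts) {
            try {
                System.out.println(host + " -> " + parseMode(srvr(host, 2181, 2000)));
            } catch (Exception e) {
                System.out.println(host + " -> unreachable (" + e.getMessage() + ")");
            }
        }
    }
}
```

Running it before and after the kill should show whether one of the two surviving nodes reports `leader`, which would separate a ZooKeeper election problem from an ActiveMQ-side one.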

When the cluster fails, we typically see this in the ActiveMQ log:

2015-12-18 09:08:45,157 | WARN  | Too many cluster members are connected.  Expected at most 3 members but there are 4 connected. | org.apache.activemq.leveldb.replicated.MasterElector | WrapperSimpleAppMain-EventThread
...
...
2015-12-18 09:27:09,722 | WARN  | Session 0x351b43b4a560016 for server null, unexpected error, closing socket connection and attempting reconnect | org.apache.zookeeper.ClientCnxn | WrapperSimpleAppMain-SendThread(192.168.0.10:2181)
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)[:1.7.0_79]
at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)[:1.7.0_79]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)[zookeeper-3.4.6.jar:3.4.6-1569965]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)[zookeeper-3.4.6.jar:3.4.6-1569965]

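Our immediate concerns are: (A) ActiveMQ seems to think there are four members in the cluster although it is configured for only 3, and (B) the server appears to be null when the exception is raised. We then raised ActiveMQ's logging level to DEBUG so that the member list would be shown: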

2015-12-18 09:33:04,236 | DEBUG | ZooKeeper group changed: Map(localhost -> ListBuffer((0000000156,{"id":"localhost","container":null,"address":null,"position":-1,"weight":5,"elected":null}), (0000000157,{"id":"localhost","container":null,"address":null,"position":-1,"weight":1,"elected":null}), (0000000158,{"id":"localhost","container":null,"address":"tcp://192.168.0.11:61619","position":-1,"weight":10,"elected":null}), (0000000159,{"id":"localhost","container":null,"address":null,"position":-1,"weight":10,"elected":null}))) | org.apache.activemq.leveldb.replicated.MasterElector | ActiveMQ BrokerService[localhost] Task-14
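The DEBUG line is hard to read by eye, so here is a rough sketch that pulls out each member's sequence number, weight and address. The regular expression is fitted only to the log format shown above and the class name is hypothetical; it is not an ActiveMQ API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical log helper, written against the single DEBUG line shown above.
public class GroupLogInspector {

    // One member entry looks like: (0000000156,{..."address":null,..."weight":5,...})
    private static final Pattern ENTRY = Pattern.compile(
        "\\((\\d+),\\{[^}]*\"address\":(null|\"[^\"]*\")[^}]*\"weight\":(\\d+)[^}]*\\}\\)");

    /** Returns one "sequence weight=N address=X" summary per member entry found in the line. */
    public static List<String> summarize(String logLine) {
        List<String> members = new ArrayList<String>();
        Matcher m = ENTRY.matcher(logLine);
        while (m.find()) {
            members.add(m.group(1) + " weight=" + m.group(3) + " address=" + m.group(2));
        }
        return members;
    }

    public static void main(String[] args) {
        // Pass the full "ZooKeeper group changed: ..." line as the first argument.
        if (args.length == 0) {
            System.err.println("usage: java GroupLogInspector \"<log line>\"");
            return;
        }
        List<String> members = summarize(args[0]);
        System.out.println(members.size() + " member entries:");
        for (String member : members) {
            System.out.println("  " + member);
        }
    }
}
```

For the line above this lists four entries with weights 5, 1, 10 and 10. Only one broker is configured with weight 10 (server.2), so the two weight-10 entries might point to that broker being registered twice, though that is a guess from a single log line.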

Can anyone suggest why this is happening and/or propose a fix? Our configuration is as follows:

ZooKeeper:

tickTime=2000
dataDir=C:\\zookeeper-3.4.7\\data
clientPort=2181
initLimit=5
syncLimit=2
server.1=192.168.0.10:2888:3888
server.2=192.168.0.11:2888:3888
server.3=192.168.0.12:2888:3888
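For reference, a three-server ensemble like this one is expected to survive the loss of a single server, because ZooKeeper needs only a strict majority of the configured servers. The arithmetic, as a small sketch with illustrative names:

```java
// Illustrative arithmetic only; names are not taken from ZooKeeper's code base.
public class QuorumMath {

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    public static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can be down while a majority is still reachable. */
    public static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        int n = 3; // server.1, server.2 and server.3 above
        System.out.println("majority=" + majority(n)
            + ", tolerable failures=" + tolerableFailures(n));
    }
}
```

So stopping one of the three ZooKeeper processes should by itself leave a working quorum, which matches the observation that the other two nodes carry on.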

ActiveMQ (server 1):

<persistenceAdapter>    
    <replicatedLevelDB
    directory="activemq-data"
    replicas="3"
    bind="tcp://0.0.0.0:61619"
    zkAddress="192.168.0.11:2181,192.168.0.10:2181,192.168.0.12:2181"
    zkPath="/activemq/leveldb-stores"
    hostname="192.168.0.10"
    weight="5"/>
    <!-- server.2 has a weight of 10, server.3 has a weight of 1 -->
</persistenceAdapter>

