
Spring AMQP v1.4.2 - Rabbit reconnection issue on network failure


I am testing the following scenario with Spring AMQP v1.4.2, and the application fails to reconnect after a network interruption:

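  • Start the Spring application, which consumes messages asynchronously using rabbit:listener-container and rabbit:connection-factory (detailed configuration below)
  • The log shows that the application is receiving messages successfully
  • Make RabbitMQ invisible to the application by dropping inbound network traffic on the Rabbit server:
    sudo iptables -A INPUT -p tcp --destination-port 5672 -j DROP
  • Wait at least 3 minutes (for the network connections to time out)
  • Restore the connection with:
    sudo iptables -D INPUT -p tcp --destination-port 5672 -j DROP
  • Wait for some time (I even tried more than an hour); no reconnection happens
  • Restart the application and it starts receiving messages again, which means the network is back to normal
  • I also tested the same scenario by disconnecting the VM network adapter instead of using iptables, and the same thing happened: no automatic reconnection. Interestingly, when I use iptables REJECT instead of DROP, it works as expected and the application resumes as soon as I remove the REJECT rule, but I think REJECT is more like a server failure than a network failure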

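    According to the reference documentation:

    If a MessageListener fails because of a business exception, the exception is handled by the message listener container, which then goes back to listening for another message. If the failure is caused by a dropped connection (not a business exception), the consumer that is collecting messages for the listener has to be cancelled and restarted. The SimpleMessageListenerContainer handles this seamlessly, and it leaves a log entry saying that the listener is being restarted. In fact, it loops endlessly trying to restart the consumer, and it gives up only if the consumer is very badly behaved. One side effect is that if the broker is down when the container starts, it keeps trying until a connection can be established.

    This is the log I get about a minute after the disconnection: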

        2015-01-16 14:00:42,433 WARN  [SimpleAsyncTaskExecutor-5] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer raised exception, processing can restart if the connection factory supports it
    com.rabbitmq.client.ShutdownSignalException: connection error
        at com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:717) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:707) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:565) ~[amqp-client-3.4.2.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
    Caused by: java.io.EOFException: null
        at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290) ~[na:1.7.0_55]
        at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:534) ~[amqp-client-3.4.2.jar:na]
        ... 1 common frames omitted
    
    2015-01-16 14:18:14,551 WARN  [SimpleAsyncTaskExecutor-2] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer raised exception, processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection timed out
    
    A few seconds after the connection is restored, I get this log message:

        2015-01-16 14:00:42,433 WARN  [SimpleAsyncTaskExecutor-5] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer raised exception, processing can restart if the connection factory supports it
    com.rabbitmq.client.ShutdownSignalException: connection error
        at com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:717) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:707) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:565) ~[amqp-client-3.4.2.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
    Caused by: java.io.EOFException: null
        at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290) ~[na:1.7.0_55]
        at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139) ~[amqp-client-3.4.2.jar:na]
        at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:534) ~[amqp-client-3.4.2.jar:na]
        ... 1 common frames omitted
    
    2015-01-16 14:18:14,551 WARN  [SimpleAsyncTaskExecutor-2] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer raised exception, processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection timed out
    
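    Update: Strangely, when I enable debug logging for the org.springframework.amqp package, the reconnection succeeds and I can no longer reproduce the problem.

    Without debug logging enabled, I tried stepping through the Spring AMQP code. I observed that shortly after the iptables DROP rule is removed, SimpleMessageListenerContainer.doStop() is called, which in turn calls shutdown() and cancels all channels. When I set a breakpoint in doStop(), I also got the following log messages, which seem to be related to the cause: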

    2015-01-20 15:28:44,200 ERROR [pool-1-thread-16] org.springframework.amqp.rabbit.connection.CachingConnectionFactory Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=405, reply-text=RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'e4288669-2422-40e6-a2ee-b99542509273' in vhost '/', class-id=50, method-id=10)
    2015-01-20 15:28:44,243 WARN  [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.BlockingQueueConsumer Failed to declare queue:e4288669-2422-40e6-a2ee-b99542509273
    2015-01-20 15:28:44,243 WARN  [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.BlockingQueueConsumer Queue declaration failed; retries left=0
    org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$DeclarationException: Failed to declare queue(s):[e4288669-2422-40e6-a2ee-b99542509273]
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:486) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:401) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1022) [spring-rabbit-1.4.2.RELEASE.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
    2015-01-20 15:28:49,245 ERROR [pool-1-thread-16] org.springframework.amqp.rabbit.connection.CachingConnectionFactory Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=405, reply-text=RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'e4288669-2422-40e6-a2ee-b99542509273' in vhost '/', class-id=50, method-id=10)
    2015-01-20 15:28:49,283 WARN  [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.BlockingQueueConsumer Failed to declare queue:e4288669-2422-40e6-a2ee-b99542509273
    2015-01-20 15:28:49,300 ERROR [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer received fatal exception on startup
    org.springframework.amqp.rabbit.listener.QueuesNotAvailableException: Cannot prepare queue for listener. Either the queue doesn't exist or the broker will not allow us to use it.
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:429) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
        at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1022) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
    Caused by: org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$DeclarationException: Failed to declare queue(s):[e4288669-2422-40e6-a2ee-b99542509273]
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:486) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
        at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:401) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
        ... 2 common frames omitted
    2015-01-20 15:28:49,301 ERROR [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Stopping container from aborted consumer
    

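    I just ran your test as described (Rabbit on Linux, using iptables to drop packets).

    Nothing is logged when the connection is re-established (perhaps we should add that).

    I suggest you turn on debug logging to see the reconnection.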

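    EDIT:

    From the RabbitMQ documentation:

    exclusive: Exclusive queues may only be accessed by the current connection, and are deleted when that connection closes. Passive declaration of an exclusive queue by other connections is not allowed.

    From your exception:

    reply-code=405, reply-text=RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'e4288669-2422-40e6-a2ee-b99542509273' in vhost '/', class-id=50, method-id=10)

    So the problem is that the broker still thinks the other (old) connection exists.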

  • Don't use exclusive queues (you will lose messages with such a queue). Or
  • Set a low requestedHeartbeat so that the broker detects the lost connection sooner
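To make the reasoning above concrete, here is a toy model in plain Java of why a shorter detection window helps. It is not RabbitMQ or Spring AMQP code: the class `ExclusiveQueueModel`, its method, and the timeout values are hypothetical and only mimic the idea that a broker refuses access to an exclusive queue for as long as it still believes the original connection is alive.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not RabbitMQ code: every name and number here is hypothetical. */
final class ExclusiveQueueModel {
    /** Seconds of silence after which the modeled broker considers a connection dead. */
    private final long silenceLimitSeconds;
    /** Queue name -> connection that owns it exclusively. */
    private final Map<String, String> queueOwner = new HashMap<>();
    /** Connection -> time (in seconds) the modeled broker last heard from it. */
    private final Map<String, Long> lastHeard = new HashMap<>();

    ExclusiveQueueModel(long silenceLimitSeconds) {
        this.silenceLimitSeconds = silenceLimitSeconds;
    }

    /** Declares an exclusive queue; returns "OK" or "RESOURCE_LOCKED". */
    String declareExclusive(String queue, String connection, long nowSeconds) {
        // Forget connections that have been silent for longer than the limit...
        lastHeard.values().removeIf(seen -> nowSeconds - seen > silenceLimitSeconds);
        // ...and delete the exclusive queues that belonged to them.
        queueOwner.values().removeIf(c -> !lastHeard.containsKey(c));

        lastHeard.put(connection, nowSeconds);
        String owner = queueOwner.putIfAbsent(queue, connection);
        return (owner == null || owner.equals(connection)) ? "OK" : "RESOURCE_LOCKED";
    }

    public static void main(String[] args) {
        // Long detection window: the old connection still "exists" at t=200.
        ExclusiveQueueModel slow = new ExclusiveQueueModel(600);
        slow.declareExclusive("q", "old-connection", 0);
        System.out.println(slow.declareExclusive("q", "new-connection", 200));

        // Short detection window: the old connection was dropped before t=200.
        ExclusiveQueueModel fast = new ExclusiveQueueModel(60);
        fast.declareExclusive("q", "old-connection", 0);
        System.out.println(fast.declareExclusive("q", "new-connection", 200));
    }
}
```

In this sketch the first declaration from the new connection is refused and the second succeeds; the only difference between the two runs is how quickly the modeled broker gives up on the silent connection, which is the effect a lower heartbeat is meant to achieve.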

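  • We faced this problem in our production environment too, probably because the Rabbit nodes run as virtual machines on different ESX racks. The workaround we found is to have the client application keep trying to reconnect whenever it is disconnected from the cluster. Below are the settings we applied, which work: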

    <util:properties id="spring.amqp.global.properties">
      <prop key="smlc.missing.queues.fatal">false</prop>
    </util:properties>
    
    
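    This property changes Spring AMQP's global behavior when declaring a queue fails with a fatal error (broker unavailable, and so on). By default, the container tries only 3 times (see the log message showing "retries left=0").

    Reference:

    In addition, we added a recovery interval so that the container recovers from non-fatal errors. The same setting is also used when the global behavior is to retry on fatal errors as well (such as missing queues).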
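The retry behavior described above can be pictured with a small stand-alone simulation. This is a simplified sketch, not Spring AMQP's implementation: the class `DeclarationRetryModel`, the `maxRecoveryCycles` cap, and the returned strings are hypothetical, and the count of 3 attempts is taken from the description above.

```java
import java.util.function.BooleanSupplier;

/** Simplified model of the startup retry behavior; all names are hypothetical. */
final class DeclarationRetryModel {
    /** Declaration attempts per cycle, as described above ("only 3 times"). */
    static final int DECLARATION_RETRIES = 3;

    /**
     * @param queueAvailable     reports whether a single declaration attempt succeeds
     * @param missingQueuesFatal mirrors the smlc.missing.queues.fatal setting
     * @param maxRecoveryCycles  cap so the sketch terminates; a container would keep looping
     * @return "STARTED", "STOPPED" or "STILL_RETRYING"
     */
    static String start(BooleanSupplier queueAvailable,
                        boolean missingQueuesFatal,
                        int maxRecoveryCycles) {
        for (int cycle = 0; cycle < maxRecoveryCycles; cycle++) {
            for (int attempt = 0; attempt < DECLARATION_RETRIES; attempt++) {
                if (queueAvailable.getAsBoolean()) {
                    return "STARTED";
                }
            }
            if (missingQueuesFatal) {
                // Fatal mode: give up after the first batch of failed attempts.
                return "STOPPED";
            }
            // Non-fatal mode: a container would wait for the recovery interval here.
        }
        return "STILL_RETRYING";
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // The queue becomes usable on the 5th attempt (e.g. once the stale connection is gone).
        System.out.println(start(() -> ++calls[0] >= 5, true, 10));
        calls[0] = 0;
        System.out.println(start(() -> ++calls[0] >= 5, false, 10));
    }
}
```

With the fatal flag set, the model stops after the first three failed attempts; with it cleared, the model keeps cycling and starts once the queue becomes usable.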

    
    ....
    
    Set setRequestedHeartBeat on the ConnectionFactory and setMissingQueuesFatal(false) on the SimpleMessageListenerContainer so that the connection is retried indefinitely. By default, missingQueuesFatal on SimpleMessageListenerContainer is true, and it retries only 3 times.

      @Bean
      public ConnectionFactory connectionFactory() {
        final CachingConnectionFactory connectionFactory = new CachingConnectionFactory(getHost(), getPort());
        connectionFactory.setUsername(getUsername());
        connectionFactory.setPassword(getPassword());
        connectionFactory.setVirtualHost(getVirtualHost());
        // Heartbeat interval in seconds, so that a dead connection is noticed sooner
        connectionFactory.setRequestedHeartBeat(30);
        return connectionFactory;
      }
    
      @Bean
      public SimpleMessageListenerContainer listenerContainerCopernicusErrorQueue() {
        final SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
        container.setConnectionFactory(connectionFactory());
        container.setQueueNames(myQueue().getName());
        container.setMessageListener(messageListenerAdapterQueue());
        container.setDefaultRequeueRejected(false);
        // Keep retrying instead of stopping the container when queue declaration fails
        container.setMissingQueuesFatal(false);
        return container;
      }
    

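    Do you mean that earlier Spring AMQP versions don't have this problem? Would you mind sharing the logs for the org.springframework.amqp.rabbit.listener category, so we can see more information about the issue? By the way, I just tried something similar (or not?) with tcpTrace on Windows and saw a similar "Caused by: java.io.EOFException: null at java.io.DataInputStream.readUnsignedByte" in the logs. But when I restart the trace, the connection is restored. My AMQP client is 3.4.2, a transitive dependency of Spring AMQP.

    Not specific to Spring AMQP, but if you need functionality for reconnecting and recovering resources such as queues, you could try using [...]

    Thanks, Gary. I tried debug logging and updated the question with more information. It seems that shortly after reconnecting, the queue redeclaration fails and shuts down the SimpleMessageListenerContainer. I [...]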