Spring AMQP v1.4.2: Rabbit reconnection problem after a network failure
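I am testing the following scenario with Spring AMQP v1.4.2, and the application fails to reconnect after a network disruption:
1. Start the Spring application, which consumes messages asynchronously using rabbit:listener-container and rabbit:connection-factory (detailed configuration follows).
2. The log shows that the application is receiving messages successfully.
3. Make RabbitMQ invisible to the application by dropping inbound network traffic on the rabbit server: sudo iptables -A INPUT -p tcp --destination-port 5672 -j DROP
4. Wait at least 3 minutes (for the network connections to time out).
5. Restore the connection with: sudo iptables -D INPUT -p tcp --destination-port 5672 -j DROP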
2015-01-16 14:00:42,433 WARN [SimpleAsyncTaskExecutor-5] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer raised exception, processing can restart if the connection factory supports it
com.rabbitmq.client.ShutdownSignalException: connection error
at com.rabbitmq.client.impl.AMQConnection.startShutdown(AMQConnection.java:717) ~[amqp-client-3.4.2.jar:na]
at com.rabbitmq.client.impl.AMQConnection.shutdown(AMQConnection.java:707) ~[amqp-client-3.4.2.jar:na]
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:565) ~[amqp-client-3.4.2.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
Caused by: java.io.EOFException: null
at java.io.DataInputStream.readUnsignedByte(DataInputStream.java:290) ~[na:1.7.0_55]
at com.rabbitmq.client.impl.Frame.readFrom(Frame.java:95) ~[amqp-client-3.4.2.jar:na]
at com.rabbitmq.client.impl.SocketFrameHandler.readFrame(SocketFrameHandler.java:139) ~[amqp-client-3.4.2.jar:na]
at com.rabbitmq.client.impl.AMQConnection$MainLoop.run(AMQConnection.java:534) ~[amqp-client-3.4.2.jar:na]
... 1 common frames omitted
A few seconds after the connection was restored, I got this log message:
2015-01-16 14:18:14,551 WARN [SimpleAsyncTaskExecutor-2] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer raised exception, processing can restart if the connection factory supports it. Exception summary: org.springframework.amqp.AmqpConnectException: java.net.ConnectException: Connection timed out
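Update: Strangely, when I enable debug logging on the org.springframework.amqp package, the reconnection succeeds and I can no longer reproduce the problem.
With debug logging disabled, I tried stepping through the Spring AMQP code. I observed that shortly after the iptables drop rule was removed, SimpleMessageListenerContainer.doStop() was called, which in turn calls shutdown() and cancels all channels. When I set a breakpoint on doStop(), I also got these log messages, which seem related to the cause: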
2015-01-20 15:28:44,200 ERROR [pool-1-thread-16] org.springframework.amqp.rabbit.connection.CachingConnectionFactory Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=405, reply-text=RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'e4288669-2422-40e6-a2ee-b99542509273' in vhost '/', class-id=50, method-id=10)
2015-01-20 15:28:44,243 WARN [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.BlockingQueueConsumer Failed to declare queue:e4288669-2422-40e6-a2ee-b99542509273
2015-01-20 15:28:44,243 WARN [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.BlockingQueueConsumer Queue declaration failed; retries left=0
org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$DeclarationException: Failed to declare queue(s):[e4288669-2422-40e6-a2ee-b99542509273]
at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:486) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:401) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1022) [spring-rabbit-1.4.2.RELEASE.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
2015-01-20 15:28:49,245 ERROR [pool-1-thread-16] org.springframework.amqp.rabbit.connection.CachingConnectionFactory Channel shutdown: channel error; protocol method: #method<channel.close>(reply-code=405, reply-text=RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'e4288669-2422-40e6-a2ee-b99542509273' in vhost '/', class-id=50, method-id=10)
2015-01-20 15:28:49,283 WARN [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.BlockingQueueConsumer Failed to declare queue:e4288669-2422-40e6-a2ee-b99542509273
2015-01-20 15:28:49,300 ERROR [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Consumer received fatal exception on startup
org.springframework.amqp.rabbit.listener.QueuesNotAvailableException: Cannot prepare queue for listener. Either the queue doesn't exist or the broker will not allow us to use it.
at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:429) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1022) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_55]
Caused by: org.springframework.amqp.rabbit.listener.BlockingQueueConsumer$DeclarationException: Failed to declare queue(s):[e4288669-2422-40e6-a2ee-b99542509273]
at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.attemptPassiveDeclarations(BlockingQueueConsumer.java:486) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
at org.springframework.amqp.rabbit.listener.BlockingQueueConsumer.start(BlockingQueueConsumer.java:401) ~[spring-rabbit-1.4.2.RELEASE.jar:na]
... 2 common frames omitted
2015-01-20 15:28:49,301 ERROR [SimpleAsyncTaskExecutor-3] org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer Stopping container from aborted consumer
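The channel shutdown lines above carry the AMQP reply code (405) and reply text (RESOURCE_LOCKED) that explain why the passive declaration was refused. As a minimal sketch for pulling those two fields out of such a log line, something like the following could be used. CloseReasonParser and CloseReason are hypothetical names for illustration only; they are not part of Spring AMQP or the RabbitMQ client, and the pattern assumes the log layout shown above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Spring AMQP or amqp-client. */
final class CloseReasonParser {

    /** Reply code and reply text taken from a "channel.close" log line. */
    static final class CloseReason {
        final int replyCode;
        final String replyText;

        CloseReason(int replyCode, String replyText) {
            this.replyCode = replyCode;
            this.replyText = replyText;
        }
    }

    // Assumes the layout "reply-code=<n>, reply-text=<text>, class-id=<n>" seen in the log above.
    private static final Pattern REASON =
            Pattern.compile("reply-code=(\\d+), reply-text=(.*?), class-id=");

    /** Returns the close reason if the line contains one, otherwise an empty Optional. */
    static Optional<CloseReason> parse(String logLine) {
        Matcher m = REASON.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new CloseReason(Integer.parseInt(m.group(1)), m.group(2)));
    }

    public static void main(String[] args) {
        String line = "Channel shutdown: channel error; protocol method: "
                + "#method<channel.close>(reply-code=405, reply-text=RESOURCE_LOCKED - "
                + "cannot obtain exclusive access to locked queue 'q' in vhost '/', "
                + "class-id=50, method-id=10)";
        parse(line).ifPresent(r -> System.out.println(r.replyCode + " " + r.replyText));
    }
}
```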
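I just ran your test as described (rabbit on Linux, using iptables to drop packets).
Nothing is logged when the connection is re-established (perhaps we should add that).
I suggest you turn on debug logging to see the reconnection.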
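EDIT:
From the RabbitMQ documentation:
exclusive
Exclusive queues may only be accessed by the current connection, and are deleted when that connection closes. Passive declaration of an exclusive queue by other connections is not allowed.
From your exception:
reply-code=405, reply-text=RESOURCE_LOCKED - cannot obtain exclusive access to locked queue 'e4288669-2422-40e6-a2ee-b99542509273' in vhost '/', class-id=50, method-id=10
So the problem is that the broker still thinks the other connection exists. Set requestedHeartbeat so that the broker detects the lost connection sooner.

We also faced this issue in our production environment, possibly because the Rabbit nodes run as virtual machines on different ESX racks. The workaround we found is to have the client application keep trying to reconnect whenever it is disconnected from the cluster. Below are the settings we applied, which work for us: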
<util:properties id="spring.amqp.global.properties">
<prop key="smlc.missing.queues.fatal">false</prop>
</util:properties>
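This property changes the global behavior of Spring AMQP when declaring queues fails with a fatal error (broker not available, etc.). By default, the container tries only 3 times (see the log message showing "retries left=0").
Reference:
In addition, we added a recovery interval so that the container recovers from non-fatal errors. The same setting is also used when the global behavior is to retry fatal errors (such as missing queues) as well.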
....
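The effect of these two settings can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: DeclarationRetrySketch, its parameters and the fake broker are hypothetical, it does not use or reproduce Spring AMQP classes, and maxRounds exists only so that the demo terminates.

```java
import java.util.function.BooleanSupplier;

/** Simplified model of the retry policy described above; hypothetical, not Spring AMQP code. */
final class DeclarationRetrySketch {

    enum Outcome { STARTED, STOPPED }

    /**
     * Calls declareQueue up to retriesPerRound times per round.
     * If a whole round fails and missingQueuesFatal is true, it gives up (STOPPED).
     * Otherwise it waits recoveryIntervalMillis and starts another round.
     */
    static Outcome run(BooleanSupplier declareQueue, int retriesPerRound,
                       boolean missingQueuesFatal, long recoveryIntervalMillis,
                       int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            for (int attempt = 0; attempt < retriesPerRound; attempt++) {
                if (declareQueue.getAsBoolean()) {
                    return Outcome.STARTED;
                }
            }
            if (missingQueuesFatal) {
                return Outcome.STOPPED; // corresponds to the container being stopped
            }
            try {
                Thread.sleep(recoveryIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Outcome.STOPPED;
            }
        }
        return Outcome.STOPPED; // demo bound reached
    }

    public static void main(String[] args) {
        // Fake broker: the queue stays locked for the first 5 declaration attempts.
        int[] calls = {0};
        BooleanSupplier lockedForFiveAttempts = () -> ++calls[0] > 5;

        System.out.println(run(lockedForFiveAttempts, 3, true, 10, 10));  // prints STOPPED
        calls[0] = 0;
        System.out.println(run(lockedForFiveAttempts, 3, false, 10, 10)); // prints STARTED
    }
}
```

In this model, a queue that stays locked for a short while after the network returns stops the consumer when missing queues are fatal, but is picked up in a later round when they are not.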
Call setRequestedHeartBeat on the ConnectionFactory and setMissingQueuesFatal(false) on the SimpleMessageListenerContainer so that the connection is retried indefinitely. By default, missingQueuesFatal on SimpleMessageListenerContainer is true, and the container retries only 3 times.
@Bean
public ConnectionFactory connectionFactory() {
    final CachingConnectionFactory connectionFactory = new CachingConnectionFactory(getHost(), getPort());
    connectionFactory.setUsername(getUsername());
    connectionFactory.setPassword(getPassword());
    connectionFactory.setVirtualHost(getVirtualHost());
    // Request a 30-second heartbeat so that a dead connection is noticed sooner.
    connectionFactory.setRequestedHeartBeat(30);
    return connectionFactory;
}

@Bean
public SimpleMessageListenerContainer listenerContainerCopernicusErrorQueue() {
    final SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
    container.setConnectionFactory(connectionFactory());
    container.setQueueNames(myQueue().getName());
    container.setMessageListener(messageListenerAdapterQueue());
    container.setDefaultRequeueRejected(false);
    // Keep retrying instead of stopping the container when the queue cannot be declared.
    container.setMissingQueuesFatal(false);
    return container;
}
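To see why a requested heartbeat helps here, consider a deliberately simplified model of dead-peer detection. It assumes that a peer is treated as dead once it has been silent for longer than the heartbeat timeout; the real client and broker negotiate the value and may apply different thresholds. HeartbeatMonitorSketch is a hypothetical class for illustration, not RabbitMQ or Spring AMQP code.

```java
/** Simplified dead-peer detector; hypothetical, not RabbitMQ or Spring AMQP code. */
final class HeartbeatMonitorSketch {

    private final long timeoutMillis;
    private long lastFrameAtMillis;

    HeartbeatMonitorSketch(int heartbeatSeconds, long nowMillis) {
        this.timeoutMillis = heartbeatSeconds * 1000L;
        this.lastFrameAtMillis = nowMillis;
    }

    /** Any frame from the peer (data or heartbeat) counts as a sign of life. */
    void onFrameReceived(long nowMillis) {
        lastFrameAtMillis = nowMillis;
    }

    /** True once the peer has been silent for longer than the timeout (model assumption). */
    boolean isPeerDead(long nowMillis) {
        return nowMillis - lastFrameAtMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        HeartbeatMonitorSketch monitor = new HeartbeatMonitorSketch(30, 0L);
        monitor.onFrameReceived(10_000L);                // last frame seen at t = 10 s
        System.out.println(monitor.isPeerDead(35_000L)); // prints false: 25 s of silence
        System.out.println(monitor.isPeerDead(45_000L)); // prints true: 35 s of silence
    }
}
```

Without any heartbeat, a side that is only waiting to read has nothing comparable to this check and may keep a dropped connection open until a TCP-level timeout fires, which would match the broker still holding the exclusive queue for the old connection.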
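Do you mean that the earliest Spring AMQP version does not have this problem? Would you mind sharing the logs for the org.springframework.amqp.rabbit.listener category so we can see more information about the issue? By the way, I just tried something similar (or not?) on Windows with tcpTrace, and I see a similar "Caused by: java.io.EOFException: null at java.io.DataInputStream.readUnsignedByte" in the logs. But when I restart the trace, the connection is restored. My AMQP client is 3.4.2, a transitive dependency of Spring AMQP.
Not specific to Spring AMQP, but if you need functionality for reconnecting and recovering resources such as queues, you can try using [...].
Thanks Gary. I tried debug logging and updated the question with more information. It seems that shortly after reconnecting, the queue re-declaration fails and shuts down the SimpleMessageListenerContainer.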