Java Netty server does not close/release sockets


I am facing a resource problem in my Netty server application:

[io.netty.channel.DefaultChannelPipeline] An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.: java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) [rt.jar:1.7.0_60]
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241) [rt.jar:1.7.0_60]
    at io.netty.channel.socket.nio.NioServerSocketChannel.doReadMessages(NioServerSocketChannel.java:135) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:69) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) [netty-all-4.0.25.Final.jar:4.0.25.Final]
    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_60]
As a workaround I raised the maximum number of open files with ulimit -n, but the number of open files/sockets still keeps growing:

lsof -p 5604 | grep socket | wc -l
The count is now above 3000.

I cannot see any open or pending connections with netstat.
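The descriptor count reported by lsof can also be read from inside the JVM, which makes it easier to log the trend next to the application's own events. The following is only a minimal sketch: the class name FdProbe and its method are hypothetical, and it assumes a Unix-like JVM where the operating system bean implements com.sun.management.UnixOperatingSystemMXBean (on other platforms it returns -1).

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of Netty or of the original application.
class FdProbe {

    /**
     * Returns {open, max} file descriptor counts for this JVM process,
     * or {-1, -1} when the platform bean does not expose them.
     */
    static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return new long[] { -1L, -1L };
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        System.out.println("open descriptors: " + counts[0] + ", limit: " + counts[1]);
    }
}
```

Logging these two numbers periodically would show whether the count climbs steadily towards the ulimit -n value or merely spikes under load.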

I use a ReadTimeoutHandler to close unused connections, together with the following exceptionCaught code:

@Override
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception {
  if (cause instanceof ReadTimeoutException) {
    logger.debug("Read timeout - close connection");
  } else {
    logger.info(cause.getMessage());
  }
  ctx.close();
}
The server bootstrap looks like this:

ServerBootstrap b = new ServerBootstrap();
b.group(bossGroup, workerGroup)
 .channel(NioServerSocketChannel.class)
 .childHandler(new ChannelInitializer<SocketChannel>() {
     @Override
     public void initChannel(SocketChannel ch) throws Exception {
         ch.pipeline().addLast(new ReadTimeoutHandler(60));
         ch.pipeline().addLast(new LoggingHandler(mySpec.getPortLookupKey().toLowerCase()));
         ch.pipeline().addLast(new RawMessageEncoder());
         ch.pipeline().addLast(new RawMessageDecoder());
         ch.pipeline().addLast(new RequestServerHandler(ctx.getWorkManager(), factory));
     }
 })
 .option(ChannelOption.SO_BACKLOG, 128)
 .childOption(ChannelOption.SO_KEEPALIVE, true);

ChannelFuture channelFuture = b.bind(port).sync();
Log for the case where the client does not disconnect:

10:04:24,104 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] REGISTERED
10:04:24,107 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] ACTIVE
10:04:24,594 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] RECEIVED(1024B)
10:04:24,597 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] RECEIVED(1024B)
10:04:24,598 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] RECEIVED(150B)
10:04:25,638 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] WRITE(1383B)
10:04:25,639 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] FLUSH
10:05:25,389 [3-1] [id: 0x48076684, /127.0.0.1:50525 => /127.0.0.1:4300] CLOSE()
10:05:25,390 [3-1] [id: 0x48076684, /127.0.0.1:50525 :> /127.0.0.1:4300] CLOSE()
10:05:25,390 [3-1] [id: 0x48076684, /127.0.0.1:50525 :> /127.0.0.1:4300] INACTIVE
10:05:25,394 [3-1] [id: 0x48076684, /127.0.0.1:50525 :> /127.0.0.1:4300] UNREGISTERED
So there is a 60-second gap before the close (as expected from the ReadTimeoutHandler).


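After some more analysis, I have the impression that the number of open files increases even when the client disconnects normally! And in that case nothing gets closed either...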

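This may be related to the following issue:

This is expected behaviour and cannot be changed. The JVM is signalling that it cannot accept the channel, so the connection cannot be started and no response can be sent. The client will see a failed connection. If you have a load balancer, it should either retry on an alternate host or return a 503 on behalf of the application.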
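The mechanism behind "Too many open files" can be illustrated without Netty: every accepted channel holds one file descriptor until close() is called on it, and accept() starts failing once the process limit is reached. The sketch below uses plain java.nio on the loopback interface. The class name DescriptorLeakDemo is hypothetical, it assumes a Unix-like JVM that exposes UnixOperatingSystemMXBean, and it does not reproduce the leak in the question; it only shows the count rising while channels are open and falling once they are closed.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; plain java.nio, not Netty.
class DescriptorLeakDemo {

    /** Open descriptor count of this JVM, or -1 where it is not exposed. */
    static long openDescriptors() {
        Object os = ManagementFactory.getOperatingSystemMXBean();
        return (os instanceof UnixOperatingSystemMXBean)
                ? ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount()
                : -1L;
    }

    /**
     * Opens n loopback connections (client and server end both live in this
     * process) and returns {before, whileOpen, afterClose} descriptor counts.
     */
    static long[] run(int n) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            // Port 0 lets the OS pick a free port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            long before = openDescriptors();
            List<SocketChannel> channels = new ArrayList<>();
            long whileOpen;
            try {
                for (int i = 0; i < n; i++) {
                    channels.add(SocketChannel.open(server.getLocalAddress())); // client end
                    channels.add(server.accept());                              // server end
                }
                whileOpen = openDescriptors();
            } finally {
                // Descriptors are only returned to the OS once each channel is closed.
                for (SocketChannel ch : channels) {
                    ch.close();
                }
            }
            return new long[] { before, whileOpen, openDescriptors() };
        }
    }

    public static void main(String[] args) throws IOException {
        long[] r = run(50);
        System.out.println("before=" + r[0] + " whileOpen=" + r[1] + " afterClose=" + r[2]);
    }
}
```

If a comparable measurement in the real server shows the count staying high after the UNREGISTERED log line, that would suggest the descriptors are being held by something other than the Netty channel itself.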


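Comments:

Which Netty version are you using? Could you insert a LoggingHandler with a sufficiently high log level before the ReadTimeoutHandler in the pipeline, and update your question with the logs?

Maybe this is more relevant to the problem.

These connection events look normal to me; I have never seen a problem like this. There is a newer Netty version, 4.0.28. Can you try it?

So are you closing the connection when the peer disconnects?

@EJP: All my handlers contain this method: "public void exceptionCaught(ChannelHandlerContext ctx, Throwable caught) throws Exception { ctx.channel().close(); }". What else do I need to do to close the connection?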