Keycloak communicating Infinispan remote exceptions generates excessive network traffic

When an exception occurs in an Infinispan cluster (version 9.4.8.Final), the node on which the exception occurred sends it to the other nodes in the cluster. This appears to be intentional.

This activity can generate a large amount of traffic, which causes timeout exceptions, which in turn makes nodes want to pass their timeout exceptions on to other nodes. In production, our 3-node Infinispan cluster completely saturated a 20 Gb/s link.

For example, in a 2-node QA cluster we observed the following:

Node 1:

ISPN000476: Timed out waiting for responses for request 7861 from node2

Node 2:

ISPN000217: Received exception from node1, see cause for remote stack trace

Along with the stack trace printed on node 2, we can see:

Timed out waiting for responses for request 7861 from node2

There were many of these. During this time we took a packet capture and could see 50 KB packets containing a list of remote errors along with their entire Java stack traces.

When this happens it is a "perfect storm": every timeout produces an error that is sent over the network, which adds to the congestion and the timeouts, and from there things degrade very quickly.

I understand that I need to troubleshoot the timeouts themselves: look for GC pauses and so on, and possibly consider increasing the timeouts. However, I would like to know whether there is a way to prevent this behavior when these events do occur. When you think about it, it seems odd that node1 times out talking to node2 and then sends a copy of the error over the network to node2 to tell it "I timed out talking to you".
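
For reference, the timeout being hit is the per-cache replication timeout, which in a WildFly-based Keycloak deployment is normally set in standalone-ha.xml. A minimal sketch of raising it, assuming the Infinispan subsystem schema in use exposes the remote-timeout attribute on clustered caches (the cache names follow the stock Keycloak container and the values are illustrative only):

    <cache-container name="keycloak">
        <transport lock-timeout="60000"/>
        <!-- remote-timeout (ms): how long a node waits for replies to a
             synchronous replication call before failing with ISPN000476 -->
        <replicated-cache name="work" remote-timeout="30000"/>
        <distributed-cache name="sessions" owners="1" remote-timeout="30000"/>
    </cache-container>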

Is there a way to avoid transmitting these remote stack traces? Any insight or suggestions would be much appreciated.

Edit

Example stack trace:

2019-12-06 11:37:01,587 ERROR [org.keycloak.services.error.KeycloakErrorHandler] (default task-26) Uncaught server error: org.infinispan.remoting.RemoteException: ISPN000217: Received exception from ********, see cause for remote stack trace
        at org.infinispan.remoting.transport.ResponseCollectors.wrapRemoteException(ResponseCollectors.java:28)
        at org.infinispan.remoting.transport.ValidSingleResponseCollector.withException(ValidSingleResponseCollector.java:37)
        at org.infinispan.remoting.transport.ValidSingleResponseCollector.addResponse(ValidSingleResponseCollector.java:21)
        at org.infinispan.remoting.transport.impl.SingleTargetRequest.receiveResponse(SingleTargetRequest.java:52)
        at org.infinispan.remoting.transport.impl.SingleTargetRequest.onResponse(SingleTargetRequest.java:35)
        at org.infinispan.remoting.transport.impl.RequestRepository.addResponse(RequestRepository.java:52)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processResponse(JGroupsTransport.java:1372)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.processMessage(JGroupsTransport.java:1275)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport.access$300(JGroupsTransport.java:126)
        at org.infinispan.remoting.transport.jgroups.JGroupsTransport$ChannelCallbacks.up(JGroupsTransport.java:1420)
        at org.jgroups.JChannel.up(JChannel.java:816)
        at org.jgroups.fork.ForkProtocolStack.up(ForkProtocolStack.java:133)
        at org.jgroups.stack.Protocol.up(Protocol.java:340)
        at org.jgroups.protocols.FORK.up(FORK.java:141)
        at org.jgroups.protocols.FRAG3.up(FRAG3.java:171)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
        at org.jgroups.protocols.FlowControl.up(FlowControl.java:339)
        at org.jgroups.protocols.pbcast.GMS.up(GMS.java:872)
        at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:240)
        at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1008)
        at org.jgroups.protocols.UNICAST3.handleDataReceived(UNICAST3.java:734)
        at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:389)
        at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:590)
        at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:131)
        at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:203)
        at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:253)
        at org.jgroups.protocols.MERGE3.up(MERGE3.java:280)
        at org.jgroups.protocols.Discovery.up(Discovery.java:295)
        at org.jgroups.protocols.TP.passMessageUp(TP.java:1249)
        at org.jgroups.util.SubmitToThreadPool$SingleMessageHandler.run(SubmitToThreadPool.java:87)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.jboss.as.clustering.jgroups.ClassLoaderThreadFactory.lambda$newThread$0(ClassLoaderThreadFactory.java:52)
        at java.lang.Thread.run(Thread.java:745)
        Suppressed: org.infinispan.util.logging.TraceException
                at org.infinispan.interceptors.impl.SimpleAsyncInvocationStage.get(SimpleAsyncInvocationStage.java:41)
                at org.infinispan.interceptors.impl.AsyncInterceptorChainImpl.invoke(AsyncInterceptorChainImpl.java:250)
                at org.infinispan.cache.impl.CacheImpl.executeCommandAndCommitIfNeeded(CacheImpl.java:1918)
                at org.infinispan.cache.impl.CacheImpl.put(CacheImpl.java:1433)
                at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:685)
                at org.infinispan.cache.impl.DecoratedCache.put(DecoratedCache.java:240)
                at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
                at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
                at org.infinispan.cache.impl.EncoderCache.put(EncoderCache.java:195)
                at org.infinispan.cache.impl.AbstractDelegatingCache.put(AbstractDelegatingCache.java:116)
                at org.keycloak.cluster.infinispan.InfinispanNotificationsManager.notify(InfinispanNotificationsManager.java:155)
                at org.keycloak.cluster.infinispan.InfinispanClusterProvider.notify(InfinispanClusterProvider.java:130)
                at org.keycloak.models.cache.infinispan.CacheManager.sendInvalidationEvents(CacheManager.java:206)
                at org.keycloak.models.cache.infinispan.UserCacheSession.runInvalidations(UserCacheSession.java:140)
                at org.keycloak.models.cache.infinispan.UserCacheSession$1.commit(UserCacheSession.java:152)
                at org.keycloak.services.DefaultKeycloakTransactionManager.commit(DefaultKeycloakTransactionManager.java:146)
                at org.keycloak.services.resources.admin.UsersResource.createUser(UsersResource.java:125)
                at sun.reflect.GeneratedMethodAccessor487.invoke(Unknown Source)
                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
                at java.lang.reflect.Method.invoke(Method.java:498)
                at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:139)
                at org.jboss.resteasy.core.ResourceMethodInvoker.internalInvokeOnTarget(ResourceMethodInvoker.java:510)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTargetAfterFilter(ResourceMethodInvoker.java:400)
                at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invokeOnTarget$0(ResourceMethodInvoker.java:364)
                at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:366)
                at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:338)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:137)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:106)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invokeOnTargetObject(ResourceLocatorInvoker.java:132)
                at org.jboss.resteasy.core.ResourceLocatorInvoker.invoke(ResourceLocatorInvoker.java:100)
                at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:439)
                at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:229)
                at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:135)
                at org.jboss.resteasy.core.interception.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:355)
                at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:138)
                at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:215)
                at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:227)
                at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56)
                at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51)
                at javax.servlet.http.HttpServlet.service(HttpServlet.java:791)
                at io.undertow.servlet.handlers.ServletHandler.handleRequest(ServletHandler.java:74)
                at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:129)
                at org.keycloak.services.filters.KeycloakSessionServletFilter.doFilter(KeycloakSessionServletFilter.java:90)
                at io.undertow.servlet.core.ManagedFilter.doFilter(ManagedFilter.java:61)
                at io.undertow.servlet.handlers.FilterHandler$FilterChainImpl.doFilter(FilterHandler.java:131)
                at io.undertow.servlet.handlers.FilterHandler.handleRequest(FilterHandler.java:84)
                at io.undertow.servlet.handlers.security.ServletSecurityRoleHandler.handleRequest(ServletSecurityRoleHandler.java:62)
                at io.undertow.servlet.handlers.ServletChain$1.handleRequest(ServletChain.java:68)
                at io.undertow.servlet.handlers.ServletDispatchingHandler.handleRequest(ServletDispatchingHandler.java:36)
                at org.wildfly.extension.undertow.security.SecurityContextAssociationHandler.handleRequest(SecurityContextAssociationHandler.java:78)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at io.undertow.servlet.handlers.security.SSLInformationAssociationHandler.handleRequest(SSLInformationAssociationHandler.java:132)
                at io.undertow.servlet.handlers.security.ServletAuthenticationCallHandler.handleRequest(ServletAuthenticationCallHandler.java:57)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at io.undertow.security.handlers.AbstractConfidentialityHandler.handleRequest(AbstractConfidentialityHandler.java:46)
                at io.undertow.servlet.handlers.security.ServletConfidentialityConstraintHandler.handleRequest(ServletConfidentialityConstraintHandler.java:64)
                at io.undertow.security.handlers.AuthenticationMechanismsHandler.handleRequest(AuthenticationMechanismsHandler.java:60)
                at io.undertow.servlet.handlers.security.CachedAuthenticatedSessionHandler.handleRequest(CachedAuthenticatedSessionHandler.java:77)
                at io.undertow.security.handlers.NotificationReceiverHandler.handleRequest(NotificationReceiverHandler.java:50)
                at io.undertow.security.handlers.AbstractSecurityContextAssociationHandler.handleRequest(AbstractSecurityContextAssociationHandler.java:43)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at org.wildfly.extension.undertow.security.jacc.JACCContextIdHandler.handleRequest(JACCContextIdHandler.java:61)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at org.wildfly.extension.undertow.deployment.GlobalRequestControllerHandler.handleRequest(GlobalRequestControllerHandler.java:68)
                at io.undertow.server.handlers.PredicateHandler.handleRequest(PredicateHandler.java:43)
                at io.undertow.servlet.handlers.ServletInitialHandler.handleFirstRequest(ServletInitialHandler.java:292)
                at io.undertow.servlet.handlers.ServletInitialHandler.access$100(ServletInitialHandler.java:81)
                at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:138)
                at io.undertow.servlet.handlers.ServletInitialHandler$2.call(ServletInitialHandler.java:135)
                at io.undertow.servlet.core.ServletRequestContextThreadSetupAction$1.call(ServletRequestContextThreadSetupAction.java:48)
                at io.undertow.servlet.core.ContextClassLoaderSetupAction$1.call(ContextClassLoaderSetupAction.java:43)
                at org.wildfly.extension.undertow.security.SecurityContextThreadSetupAction.lambda$create$0(SecurityContextThreadSetupAction.java:105)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService$UndertowThreadSetupAction.lambda$create$0(UndertowDeploymentInfoService.java:1502)
                at io.undertow.servlet.handlers.ServletInitialHandler.dispatchRequest(ServletInitialHandler.java:272)
                at io.undertow.servlet.handlers.ServletInitialHandler.access$000(ServletInitialHandler.java:81)
                at io.undertow.servlet.handlers.ServletInitialHandler$1.handleRequest(ServletInitialHandler.java:104)
                at io.undertow.server.Connectors.executeRootHandler(Connectors.java:364)
                at io.undertow.server.HttpServerExchange$1.run(HttpServerExchange.java:830)
                at org.jboss.threads.ContextClassLoaderSavingRunnable.run(ContextClassLoaderSavingRunnable.java:35)
                at org.jboss.threads.EnhancedQueueExecutor.safeRun(EnhancedQueueExecutor.java:1982)
                at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.doRunTask(EnhancedQueueExecutor.java:1486)
                at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1363)
                ... 1 more
Caused by: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 7865 from ********
        at org.infinispan.remoting.transport.impl.MultiTargetRequest.onTimeout(MultiTargetRequest.java:167)
        at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:87)
        at org.infinispan.remoting.transport.AbstractRequest.call(AbstractRequest.java:22)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        ... 1 more

No, it is not possible to disable the serialization of stack traces in Infinispan 9.4.x.

Infinispan 10.0.0.Final does not include the stack traces in exception responses, but that was only a side effect of some other work, and I have opened an issue to add the remote stack trace back.


Please add a comment to the question with an example of a complete stack trace.

We were able to resolve the issue that was causing the huge spike in network traffic. Details below.

tl;dr

We switched from the JGroups UDP stack to the TCP stack which, per the ISPN documentation quoted below, can be more efficient for smaller clusters that use distributed caches like ours.

Reproducing the problem

To reproduce the problem, we did the following:

  • Configured the Keycloak background job that clears ISPN cache entries to run every 60 seconds, so that we did not have to wait for the job to run on its 15-minute interval (standalone-ha.xml):

    60

  • Generated a large number of user sessions (we used JMeter). In our tests we ended up generating roughly 100,000 sessions.

  • Configured the SSO Session Idle and Max TTLs in Keycloak to be extremely short (60 seconds) so that all of the sessions would expire
  • Kept load on the system with JMeter (this part is important)
When the job that clears the caches ran, we would see the network traffic flood (120 MB/s). When that happened, we would see lots of the following errors on every node in the cluster:

ISPN000476: Timed out waiting for responses for request 7861 from node2

ISPN000217: Received exception from node1, see cause for remote stack trace

Pro tip: configure passivation with a file store to persist the ISPN data. Shut the cluster down and save the ".dat" files off to the side. Between tests, use those files to restore the ISPN cluster's state instantly.
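
A minimal sketch of what such a file store can look like on one of the Keycloak caches in standalone-ha.xml, assuming the subsystem's file-store element and its passivation/purge attributes (cache name and attribute support depend on the schema version):

    <distributed-cache name="sessions" owners="1">
        <!-- the file store persists entries as .dat files under the node's data
             directory; copying those files aside between test runs lets the
             cluster state be restored quickly -->
        <file-store passivation="true" purge="false"/>
    </distributed-cache>

Leaving purge="false" keeps the store from being emptied at startup, which is what allows the saved .dat files to be picked up again.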

Solving the problem

Using the technique above we were able to reproduce the problem on demand, so we set about solving it using the approach described below.

Changing the JGroups stack to use TCP

We changed the JGroups stack from UDP to TCP and configured TCPPING for discovery. We did this after reading the description of the TCP stack in the following guide:

Specifically:

"Uses TCP for transport and UDP multicast for discovery. Suitable for smaller clusters (under 100 nodes), but only when using distributed caches, because TCP is more efficient as a point-to-point protocol than UDP."

This change completely eliminated the problem for us.

Our WildFly 16 configuration in standalone-ha.xml is as follows:

<subsystem xmlns="urn:jboss:domain:jgroups:6.0">
        <channels default="ee">
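            <!-- the default "ee" channel is pointed at the "tcp" stack instead of "udp" -->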
            <channel name="ee" stack="tcp" cluster="ejb"/>
        </channels>
        <stacks>
            <stack name="udp">
                <transport type="UDP" socket-binding="jgroups-udp"/>
                <protocol type="PING"/>
                <protocol type="MERGE3"/>
                <protocol type="FD_SOCK"/>
                <protocol type="FD_ALL"/>
                <protocol type="VERIFY_SUSPECT"/>
                <protocol type="pbcast.NAKACK2"/>
                <protocol type="UNICAST3"/>
                <protocol type="pbcast.STABLE"/>
                <protocol type="pbcast.GMS"/>
                <protocol type="UFC"/>
                <protocol type="MFC"/>
                <protocol type="FRAG3"/>
            </stack>
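            <!-- added "tcp" stack: TCP transport with static TCPPING discovery replacing UDP multicast -->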
            <stack name="tcp">
                <transport type="TCP" socket-binding="jgroups-tcp"/>
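                <!-- TCPPING lists the cluster members statically on port 7600, so discovery does not need multicast -->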
                <socket-protocol type="TCPPING" socket-binding="jgroups-tcp">
                  <property name="initial_hosts">HOST-X[7600],HOST-Y[7600],HOST-Z[7600]</property>
                  <property name="port_range">1</property>
                </socket-protocol>
                <protocol type="MERGE3"/>
                <protocol type="FD_SOCK"/>
                <protocol type="FD_ALL"/>
                <protocol type="VERIFY_SUSPECT"/>
                <protocol type="pbcast.NAKACK2"/>
                <protocol type="UNICAST3"/>
                <protocol type="pbcast.STABLE"/>
                <protocol type="pbcast.GMS"/>
                <protocol type="MFC"/>
                <protocol type="FRAG3"/>
            </stack>
        </stacks>
    </subsystem>

Miscellaneous changes

We made the following additional changes, specific to our environment:

  • Blocked IP multicast + UDP at the iptables level (because we wanted to be certain we were TCP-only)
  • Configured bandwidth caps at the network level to keep the ISPN cluster from saturating the network and affecting other hosts on the same links
