Apache Ignite.NET: after updating to v2.9, some Ignite nodes fail to start - stack smashing detected

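I am running Apache Ignite.NET in a Kubernetes cluster on Linux nodes.

I recently updated my Ignite 2.8.1 cluster to v2.9. After the update, some services that are part of the cluster fail to start with the following message:

*** stack smashing detected ***: <unknown> terminated

Interestingly, this usually happens on the second instance of the same microservice. The first instance usually starts successfully (though sometimes the first instance fails too). Another observation is that it happens on nodes that publish Service Grid services. Sometimes a full cluster recycle (killing all nodes and then spinning them up again) gets all nodes to start, and sometimes it does not.

Did I mess something up during the update? What should I check first?

Below is an excerpt from the Ignite log: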

2020-12-08 22:05:25,683 [1] DEBUG  [(null)] - Classpath resolved to: /app/libs/spring-jdbc-4.3.26.RELEASE.jar;/app/libs/spring-messaging-4.3.29.RELEASE.jar;/app/libs/ignite-indexing-2.9.0.jar;/app/libs/opencensus-impl-core-0.22.0.jar;/app/libs/jackson-annotations-2.10.1.jar;/app/libs/lucene-analyzers-common-7.4.0.jar;/app/libs/jackson-dataformat-smile-2.10.1.jar;/app/libs/commons-logging-1.1.1.jar;/app/libs/spring-context-4.3.26.RELEASE.jar;/app/libs/tyrus-standalone-client-1.15.jar;/app/libs/jackson-core-2.10.1.jar;/app/libs/spring-core-4.3.29.RELEASE.jar;/app/libs/control-center-agent-2.9.0.0.jar;/app/libs/commons-codec-1.11.jar;/app/libs/disruptor-3.4.2.jar;/app/libs/javassist-3.21.0-GA.jar;/app/libs/spring-tx-4.3.26.RELEASE.jar;/app/libs/spring-core-4.3.26.RELEASE.jar;/app/libs/commons-logging-1.2.jar;/app/libs/spring-beans-4.3.26.RELEASE.jar;/app/libs/h2-1.4.197.jar;/app/libs/ignite-core-2.9.0.jar;/app/libs/spring-aop-4.3.26.RELEASE.jar;/app/libs/reflections8-0.11.7.jar;/app/libs/cache-api-1.0.0.jar;/app/libs/spring-websocket-4.3.29.RELEASE.jar;/app/libs/lucene-core-7.4.0.jar;/app/libs/jackson-databind-2.10.1.jar;/app/libs/ignite-spring-2.9.0.jar;/app/libs/grpc-context-1.19.0.jar;/app/libs/lucene-queryparser-7.4.0.jar;/app/libs/spring-web-4.3.29.RELEASE.jar;/app/libs/ignite-shmem-1.0.0.jar;/app/libs/guava-26.0-android.jar;/app/libs/spring-expression-4.3.26.RELEASE.jar:/app/libs/spring-jdbc-4.3.26.RELEASE.jar:/app/libs/spring-messaging-4.3.29.RELEASE.jar:/app/libs/ignite-indexing-2.9.0.jar:/app/libs/opencensus-impl-core-0.22.0.jar:/app/libs/jackson-annotations-2.10.1.jar:/app/libs/lucene-analyzers-common-7.4.0.jar:/app/libs/jackson-dataformat-smile-2.10.1.jar:/app/libs/commons-logging-1.1.1.jar:/app/libs/spring-context-4.3.26.RELEASE.jar:/app/libs/tyrus-standalone-client-1.15.jar:/app/libs/jackson-core-2.10.1.jar:/app/libs/spring-core-4.3.29.RELEASE.jar:/app/libs/control-center-agent-2.9.0.0.jar:/app/libs/commons-codec-1.11.jar:/app/libs/disruptor-3.4.2.jar:/app
/libs/javassist-3.21.0-GA.jar:/app/libs/spring-tx-4.3.26.RELEASE.jar:/app/libs/spring-core-4.3.26.RELEASE.jar:/app/libs/commons-logging-1.2.jar:/app/libs/spring-beans-4.3.26.RELEASE.jar:/app/libs/h2-1.4.197.jar:/app/libs/ignite-core-2.9.0.jar:/app/libs/spring-aop-4.3.26.RELEASE.jar:/app/libs/reflections8-0.11.7.jar:/app/libs/cache-api-1.0.0.jar:/app/libs/spring-websocket-4.3.29.RELEASE.jar:/app/libs/lucene-core-7.4.0.jar:/app/libs/jackson-databind-2.10.1.jar:/app/libs/ignite-spring-2.9.0.jar:/app/libs/grpc-context-1.19.0.jar:/app/libs/lucene-queryparser-7.4.0.jar:/app/libs/spring-web-4.3.29.RELEASE.jar:/app/libs/ignite-shmem-1.0.0.jar:/app/libs/guava-26.0-android.jar:/app/libs/spring-expression-4.3.26.RELEASE.jar:
2020-12-08 22:05:25,860 [1] DEBUG  [(null)] - JVM started.
[22:05:26,184][INFO][main][XmlBeanDefinitionReader] Loading XML bean definitions from URL [file:/app/./kubernetes.config]
...
2020-12-08 22:05:37,936 [70] INFO  org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander [(null)] - Completed rebalance future: RebalanceFuture [state=STARTED, grp=CacheGroupContext [grp=ignite-sys-cache], topVer=AffinityTopologyVersion [topVer=82, minorTopVer=0], rebalanceId=1, routines=4, receivedBytes=1200, receivedKeys=0, partitionsLeft=0, startTime=1607465137846, endTime=-1, lastCancelledTime=-1, next=null]
2020-12-08 22:05:37,936 [70] DEBUG org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander [(null)] - Partitions have been scheduled to resend [reason=Rebalance is done, grp=ignite-sys-cache]
2020-12-08 22:05:37,937 [70] DEBUG org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander [(null)] - Finished rebalancing partition: [grp=ignite-sys-cache, topVer=AffinityTopologyVersion [topVer=82, minorTopVer=0], supplier=12ca76f0-3286-4779-a426-408d5d6cf226, p=61]
2020-12-08 22:05:37,937 [70] DEBUG org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionDemander [(null)] - Will not request next demand message [grp=ignite-sys-cache, topVer=AffinityTopologyVersion [topVer=82, minorTopVer=0], supplier=12ca76f0-3286-4779-a426-408d5d6cf226, rebalanceFuture=RebalanceFuture [state=STARTED, grp=CacheGroupContext [grp=ignite-sys-cache], topVer=AffinityTopologyVersion [topVer=82, minorTopVer=0], rebalanceId=1, routines=4, receivedBytes=1200, receivedKeys=0, partitionsLeft=0, startTime=1607465137846, endTime=1607465137937, lastCancelledTime=-1, next=null]]
2020-12-08 22:05:37,943 [71] DEBUG org.apache.ignite.internal.processors.odbc.ClientListenerProcessor [(null)] - Grid runnable started: nio-acceptor-client-listener
2020-12-08 22:05:37,944 [72] DEBUG org.apache.ignite.internal.processors.odbc.ClientListenerProcessor [(null)] - Grid runnable started: grid-nio-worker-client-listener-0
2020-12-08 22:05:37,944 [1] DEBUG org.apache.ignite.internal.processors.service.IgniteServiceProcessor [(null)] - Started service processor.
2020-12-08 22:05:37,954 [73] DEBUG org.apache.ignite.internal.processors.service.ServiceDeploymentManager [(null)] - Grid runnable started: services-deployment-worker
2020-12-08 22:05:37,955 [73] DEBUG org.apache.ignite.internal.processors.service.ServiceDeploymentTask [(null)] - Started services deployment task init: [depId=ServiceDeploymentProcessId [topVer=AffinityTopologyVersion [topVer=81, minorTopVer=0], reqId=null], locId=c894369e-d55b-4d7b-8e5e-c990d0547121, evt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=c894369e-d55b-4d7b-8e5e-c990d0547121, consistentId=product-service-deployment-7c69d99ff6-vc6nb, addrs=ArrayList [10.0.2.27, 127.0.0.1], sockAddrs=HashSet [/127.0.0.1:47500, product-service-deployment-7c69d99ff6-vc6nb/10.0.2.27:47500], discPort=47500, order=81, intOrder=44, lastExchangeTime=1607465137554, loc=true, ver=2.9.0#20201015-sha1:70742da8, isClient=false], topVer=81, msgTemplate=null, span=org.apache.ignite.internal.processors.tracing.NoopSpan@3f4cf36, nodeId8=c894369e, msg=null, type=NODE_JOINED, tstamp=1607465136027]]
2020-12-08 22:05:38,017 [73] DEBUG org.apache.ignite.internal.processors.resource.GridResourceProcessor [(null)] - Injecting resources [obj=org.apache.ignite.internal.processors.platform.cluster.PlatformClusterNodeFilterImpl@5d421915]
2020-12-08 22:05:38,038 [1] DEBUG org.apache.ignite.internal.processors.rest.GridRestProcessor [(null)] - REST processor started.
2020-12-08 22:05:38,056 [74] DEBUG org.apache.ignite.internal.processors.rest.GridRestProcessor [(null)] - Grid runnable started: session-timeout-worker
2020-12-08 22:05:38,098 [32] DEBUG org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor [(null)] - Timeout has occurred [obj=CancelableTask [id=d5e43644671-3ea29289-4345-4d80-8eab-97397473a5a9, endTime=1607465138070, period=10000, cancel=false, task=org.apache.ignite.internal.processors.query.h2.ConnectionManager$$Lambda$307/57085696@6197e588], process=true]
2020-12-08 22:05:38,110 [1] DEBUG org.apache.ignite.internal.processors.resource.GridResourceProcessor [(null)] - Injecting resources [obj=org.gridgain.control.agent.processor.lifecycle.ClusterLifecycleProcessor$$Lambda$586/893320639@55cff952]
2020-12-08 22:05:38,142 [75] DEBUG org.apache.ignite.internal.managers.communication.GridIoManager [(null)] - Message set has not been changed: GridCommunicationMessageSet [nodeId=3f89e86c-f636-4324-895b-1a77cec8ed11, endTime=1607465141249, timeoutId=8fe43644671-3ea29289-4345-4d80-8eab-97397473a5a9, topic=TOPIC_COMM_USER, plc=0, msgs=ConcurrentLinkedDeque [], reserved=false, timeout=5000, skipOnTimeout=true, lastTs=1607465136249]
2020-12-08 22:05:38,148 [1] WARN  org.gridgain.control.agent.ControlCenterAgent [(null)] - Current Ignite configuration does not support tracing functionality and Control Center agent will not collect traces (consider adding ignite-opencensus module to classpath).
2020-12-08 22:05:38,152 [1] DEBUG org.apache.ignite.internal.processors.resource.GridResourceProcessor [(null)] - Injecting resources [obj=org.gridgain.control.agent.ControlCenterAgent$$Lambda$591/1985869725@151335cb]
2020-12-08 22:05:38,175 [76] DEBUG org.apache.ignite.internal.managers.communication.GridIoManager [(null)] - Message set has not been changed: GridCommunicationMessageSet [nodeId=3f89e86c-f636-4324-895b-1a77cec8ed11, endTime=1607465141249, timeoutId=8fe43644671-3ea29289-4345-4d80-8eab-97397473a5a9, topic=TOPIC_COMM_USER, plc=0, msgs=ConcurrentLinkedDeque [], reserved=false, timeout=5000, skipOnTimeout=true, lastTs=1607465136249]
2020-12-08 22:05:38,476 [73] DEBUG org.apache.ignite.internal.processors.service.ServiceDeploymentTask [(null)] - Calculated service assignment : [srvcId=56296344671-81118589-d216-4762-a835-3df2230389c5, srvcTop={c894369e-d55b-4d7b-8e5e-c990d0547121=1, 3f89e86c-f636-4324-895b-1a77cec8ed11=1}]
2020-12-08 22:05:38,484 [73] DEBUG org.apache.ignite.internal.processors.resource.GridResourceProcessor [(null)] - Injecting resources [obj=org.apache.ignite.internal.processors.platform.dotnet.PlatformDotNetServiceImpl@20119802]
*** stack smashing detected ***: <unknown> terminated
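
When comparing failing and healthy pods, it may help to pull out the last log entries written before the `*** stack smashing detected ***` line. In the excerpt above, the final entry is the resource injection for `PlatformDotNetServiceImpl` on thread 73, right after the service assignment was calculated. The following is a minimal Python sketch for extracting that context from a captured log. The helper names `lines_before_crash` and `logger_name` are hypothetical and not part of Ignite, and the regular expression assumes the log line layout shown in this post.

```python
import re

# Marker printed by the native runtime when the stack protector aborts the process.
CRASH_MARKER = "*** stack smashing detected ***"


def lines_before_crash(log_text, context=3):
    """Return up to `context` non-empty lines preceding the first crash marker.

    Returns an empty list if the marker does not occur in the log.
    """
    lines = [ln for ln in log_text.splitlines() if ln.strip()]
    for i, ln in enumerate(lines):
        if CRASH_MARKER in ln:
            return lines[max(0, i - context):i]
    return []


def logger_name(line):
    """Extract the Java logger class from a line shaped like
    '2020-12-08 22:05:38,484 [73] DEBUG some.logger.Class [(null)] - message'.

    Returns None when the line does not follow that layout.
    """
    m = re.match(r"^\S+ \S+ \[\d+\]\s+\w+\s+(\S+) \[", line)
    return m.group(1) if m else None
```

Running `lines_before_crash` over the logs of several failed pods and comparing the `logger_name` of the last entry would show whether the crash always follows the same step (here, .NET service deployment) or varies between runs.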