Java GC CMS暂停时间随时间增加

Java GC CMS暂停时间随时间增加,java,garbage-collection,Java,Garbage Collection,我们遇到了一个问题,即大约超过8小时的间隔。我们的节点GC暂停时间会增加,最终会得到一个最长的暂停时间~0.8-1.2秒(完整GC)…然后它会下降到~0.1秒或,并恢复8小时。循环 我们在2个芯盒和4个芯盒上看到的~8小时内的波形一致,4个芯盒的平均停顿时间较小,但峰值相同,即在我们的2个芯盒上同样糟糕 很明显,当GC运行时,每次运行都做得不够,因为我们继续看到堆/暂停时间在8小时内增加。因此,我的问题是,我们如何强制GC循环更有效。而不是等到它需要一个长的完整GC循环。(我想将堆切割一半..会

我们遇到了一个问题,即大约超过8小时的间隔。我们的节点GC暂停时间会增加,最终会得到一个最长的暂停时间~0.8-1.2秒(完整GC)…然后它会下降到~0.1秒或,并恢复8小时。循环

我们在2个芯盒和4个芯盒上看到的~8小时内的波形一致,4个芯盒的平均停顿时间较小,但峰值相同,即在我们的2个芯盒上同样糟糕

很明显,当GC运行时,每次运行都做得不够,因为我们继续看到堆/暂停时间在8小时内增加。因此,我的问题是,我们如何强制GC循环更有效。而不是等到它需要一个长的完整GC循环。(我想将堆切割一半..会迫使它更频繁地运行..希望导致“一半”的暂停时间峰值..但这并不理想)

(顺便说一句,我们在Java8,4GB机器上运行这些JVM)。 创建的对象通常是短暂的(系统本质上代表一个队列—大量对象输入,大量对象输出..通常在几秒钟内..其只是恒定的对象流,cpu使用率“低”)

我们的GC参数

-XX:+CMSClassUnloadingEnabled
-XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42
-Xms3000M
-Xmx3000M
-Xmn1500M
-XX:+HeapDumpOnOutOfMemoryError
-Xss256k
-XX:StringTableSize=1000003
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB
-XX:+PerfDisableSharedMem
-XX:CompileCommandFile=./../conf/hotspot_compiler
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:CMSWaitDuration=10000
-XX:+UseCondCardMark
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-Xloggc:/logs/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=5
-XX:GCLogFileSize=100M
两个事件的GC日志片段显示了长GC(以及cms remark final上的长拖拽字符串表),第二个事件我们也看到了我认为非常长的类卸载时间(我们不做任何类的动态加载)


看起来你实习的字符串太多了。为什么需要100万的实习生字符串表?不应扣留过早死亡并可由年轻人收集清理的绳子。只有最有可能在旧一代收集中保存下来的字符串才应该被扣押。或者你需要通过插入这么多字符串来节省内存吗?

什么版本的Java?Java1.8?次要版本?我建议改用G1收集器。如果你有GC日志,那么你也应该发布它们。捕捉统计数据,然后在提问时保留它们是愚蠢的。公平地说,我可以在周一提供一个GC日志。我们尝试了G1。更糟糕的是:(Java 1.8各种版本,我们有多种版本供测试。所有版本都展示了相同的模式。我们也将在周一获得这些详细信息)我从4个核心框之一上传了一个gc日志,在这里可以看到Z字形模式,在峰值处有长时间的暂停…当然,我们希望避免长时间的暂停:我注意到的是,CMS的最终备注阶段似乎将其大部分时间花在了擦洗字符串表上,0.3960129秒。这通常代表60-70%的长时间暂停,因此现在使用:-XX:+parallelRefProceEnabledin重新运行测试,在不要求提问者澄清的情况下,答案应该是有用的。答案中的问题可能应该是评论。
2018-07-17T06:20:43.738+0000: 33759.362: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1154831K(1536000K)] 1165177K(2918400K), 0.0048510 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-17T06:20:43.743+0000: 33759.367: Total time for which application threads were stopped: 0.0082435 seconds
2018-07-17T06:20:43.743+0000: 33759.367: [CMS-concurrent-mark-start]
2018-07-17T06:20:43.836+0000: 33759.460: [CMS-concurrent-mark: 0.093/0.093 secs] [Times: user=0.11 sys=0.00, real=0.09 secs] 
2018-07-17T06:20:43.836+0000: 33759.460: [CMS-concurrent-preclean-start]
2018-07-17T06:20:43.844+0000: 33759.468: [CMS-concurrent-preclean: 0.007/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-17T06:20:43.844+0000: 33759.468: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2018-07-17T06:20:48.887+0000: 33764.511: [CMS-concurrent-abortable-preclean: 4.699/5.043 secs] [Times: user=5.58 sys=0.02, real=5.04 secs] 

 2018-07-17T06:20:48.892+0000: 33764.517: [GC (CMS Final Remark) 
    [YG occupancy: 89219 K (1382400 K)]
    33764.517: [Rescan (parallel) , 0.0256303 secs]
    33764.543: [weak refs processing, 0.0124976 secs]
    33764.555: [class unloading, 0.0593605 secs]
    33764.615: [scrub symbol table, 0.0028103 secs]
    33764.617: [scrub string table, 0.3960129 secs]
    [1 CMS-remark: 1154831K(1536000K)] 
    1244051K(2918400K), 0.5055637 secs] 
    [Times: user=0.58 sys=0.00, real=0.50 secs] 
2018-07-17T06:20:49.399+0000: 33765.023: Total time for which application threads were stopped: 0.5111839 seconds

2018-07-17T06:20:49.399+0000: 33765.023: [CMS-concurrent-sweep-start]
2018-07-17T06:20:49.408+0000: 33765.032: Total time for which application threads were stopped: 0.0034313 seconds
2018-07-17T06:20:49.411+0000: 33765.035: Total time for which application threads were stopped: 0.0022739 seconds
2018-07-17T06:20:49.415+0000: 33765.039: Total time for which application threads were stopped: 0.0022818 seconds
2018-07-17T06:20:49.418+0000: 33765.042: Total time for which application threads were stopped: 0.0020566 seconds
2018-07-17T06:20:49.420+0000: 33765.044: Total time for which application threads were stopped: 0.0017496 seconds
2018-07-17T06:20:49.423+0000: 33765.047: Total time for which application threads were stopped: 0.0022639 seconds
2018-07-17T06:20:49.425+0000: 33765.049: Total time for which application threads were stopped: 0.0016516 seconds
2018-07-17T06:20:49.427+0000: 33765.051: Total time for which application threads were stopped: 0.0018342 seconds
2018-07-17T06:20:49.429+0000: 33765.053: Total time for which application threads were stopped: 0.0015327 seconds
2018-07-17T06:20:49.430+0000: 33765.054: Total time for which application threads were stopped: 0.0015723 seconds
2018-07-17T06:20:49.432+0000: 33765.056: Total time for which application threads were stopped: 0.0015356 seconds
2018-07-17T06:20:49.433+0000: 33765.058: Total time for which application threads were stopped: 0.0014560 seconds
2018-07-17T06:20:49.436+0000: 33765.060: Total time for which application threads were stopped: 0.0019394 seconds
2018-07-17T06:20:49.437+0000: 33765.061: Total time for which application threads were stopped: 0.0015594 seconds
2018-07-17T06:20:49.460+0000: 33765.084: Total time for which application threads were stopped: 0.0028257 seconds
2018-07-17T06:20:50.464+0000: 33766.088: Total time for which application threads were stopped: 0.0031651 seconds
2018-07-17T06:20:50.702+0000: 33766.326: [CMS-concurrent-sweep: 1.269/1.303 secs] [Times: user=1.58 sys=0.00, real=1.31 secs] 
2018-07-17T06:20:50.702+0000: 33766.326: [CMS-concurrent-reset-start]
2018-07-17T06:20:50.706+0000: 33766.330: [CMS-concurrent-reset: 0.004/0.004 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2018-07-17T06:20:51.469+0000: 33767.093: Total time for which application threads were stopped: 0.0046566 seconds



2018-07-22T19:38:00.619+0000: 513596.243: [GC (CMS Final Remark) 
     [YG occupancy: 56180 K (1382400 K)]
        513596.243: [Rescan (parallel) , 0.0230426 secs]
        513596.266: [weak refs processing, 0.0272667 secs]
        513596.293: [class unloading, 0.1770894 secs]
        513596.471: [scrub symbol table, 0.0101295 secs]
        513596.481: [scrub string table, 0.3882333 secs]
        [1 CMS-remark: 1152442K(1536000K)] 1208623K(2918400K), 0.6355867 secs] [Times: user=0.56 sys=0.00, real=0.63 secs] 
2018-07-22T19:38:01.255+0000: 513596.879: Total time for which application threads were stopped: 0.6396559 seconds


2018-07-22T19:37:55.241+0000: 513590.865: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1152442K(1536000K)] 1173594K(2918400K), 0.0309287 secs] [Times: user=0.02 sys=0.00, real=0.03 secs] 
2018-07-22T19:37:55.272+0000: 513590.896: Total time for which application threads were stopped: 0.0344867 seconds
2018-07-22T19:37:55.272+0000: 513590.896: [CMS-concurrent-mark-start]
2018-07-22T19:37:55.443+0000: 513591.067: [CMS-concurrent-mark: 0.170/0.170 secs] [Times: user=0.16 sys=0.00, real=0.17 secs] 
2018-07-22T19:37:55.443+0000: 513591.067: [CMS-concurrent-preclean-start]
2018-07-22T19:37:55.451+0000: 513591.075: [CMS-concurrent-preclean: 0.007/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-22T19:37:55.451+0000: 513591.075: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2018-07-22T19:38:00.615+0000: 513596.239: [CMS-concurrent-abortable-preclean: 3.434/5.164 secs] [Times: user=4.10 sys=0.11, real=5.16 secs] 

2018-07-22T19:38:00.619+0000: 513596.243: [GC (CMS Final Remark) 
     [YG occupancy: 56180 K (1382400 K)]
        513596.243: [Rescan (parallel) , 0.0230426 secs]
        513596.266: [weak refs processing, 0.0272667 secs]
        513596.293: [class unloading, 0.1770894 secs]
        513596.471: [scrub symbol table, 0.0101295 secs]
        513596.481: [scrub string table, 0.3882333 secs]
        [1 CMS-remark: 1152442K(1536000K)] 1208623K(2918400K), 0.6355867 secs] [Times: user=0.56 sys=0.00, real=0.63 secs] 

2018-07-22T19:38:01.255+0000: 513596.879: Total time for which application threads were stopped: 0.6396559 seconds
2018-07-22T19:38:01.255+0000: 513596.880: [CMS-concurrent-sweep-start]
2018-07-22T19:38:02.209+0000: 513597.833: [CMS-concurrent-sweep: 0.953/0.953 secs] [Times: user=1.10 sys=0.00, real=0.96 secs] 
2018-07-22T19:38:02.209+0000: 513597.833: [CMS-concurrent-reset-start]
2018-07-22T19:38:02.213+0000: 513597.837: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2018-07-22T19:38:02.260+0000: 513597.884: Total time for which application threads were stopped: 0.0045430 seconds