
Cassandra running out of memory


In our test environment, we have a single-node Cassandra cluster with RF=1 for all keyspaces.

The VM parameters of interest are listed below:

-XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorityPolicy -XX:ThreadPriorityPolicy=42 -Xms2G -Xmx2G -Xmn1G -XX:+HeapDumpOnOutOfMemoryError -Xss256k -XX:StringTableSize=1000003 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
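On Cassandra 2.1, the heap flags above are normally derived from conf/cassandra-env.sh rather than passed directly. A minimal sketch of overriding them there (the 4G/800M sizes are illustrative assumptions, not a recommendation):

```shell
# conf/cassandra-env.sh -- illustrative sizes only; tune for your hardware
MAX_HEAP_SIZE="4G"    # replaces -Xms2G/-Xmx2G; cassandra-env.sh sets both from this value
HEAP_NEWSIZE="800M"   # replaces -Xmn1G; ~100MB per CPU core is the usual guidance
```

If both variables are left unset, cassandra-env.sh computes defaults from the machine's RAM and core count.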

We noticed that full GC happens frequently, and Cassandra is unresponsive during GC:

INFO  [Service Thread] 2016-12-29 15:52:40,901 GCInspector.java:252 - ParNew GC in 238ms.  CMS Old Gen: 782576192 -> 802826248; Par Survivor Space: 60068168 -> 32163264

INFO  [Service Thread] 2016-12-29 15:52:40,902 GCInspector.java:252 - ConcurrentMarkSweep GC in 1448ms.  CMS Old Gen: 802826248 -> 393377248; Par Eden Space: 859045888 -> 0; Par Survivor Space: 32163264 -> 0
We are getting a java.lang.OutOfMemoryError with the following stack trace:

ERROR [SharedPool-Worker-5] 2017-01-26 09:23:13,694 JVMStabilityInspector.java:94 - JVM state determined to be unstable.  Exiting forcefully due to:
java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.7.0_80]
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:331) ~[na:1.7.0_80]
        at org.apache.cassandra.utils.memory.SlabAllocator.getRegion(SlabAllocator.java:137) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.SlabAllocator.allocate(SlabAllocator.java:97) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.ContextAllocator.allocate(ContextAllocator.java:57) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.ContextAllocator.clone(ContextAllocator.java:47) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.utils.memory.MemtableBufferAllocator.clone(MemtableBufferAllocator.java:61) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Memtable.put(Memtable.java:192) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1237) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:400) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:363) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.db.Mutation.apply(Mutation.java:214) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.service.StorageProxy$7.runMayThrow(StorageProxy.java:1033) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2224) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) ~[na:1.7.0_80]
        at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) ~[apache-cassandra-2.1.8.jar:2.1.8]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.8.jar:2.1.8]
        at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

nodetool info

ID                     : 32251391-5eee-4891-996d-30fb225116a1
Gossip active          : true
Thrift active          : true
Native Transport active: true
Load                   : 5.74 GB
Generation No          : 1485526088
Uptime (seconds)       : 330651
Heap Memory (MB)       : 812.72 / 1945.63
Off Heap Memory (MB)   : 7.63
Data Center            : DC1
Rack                   : RAC1
Exceptions             : 0
Key Cache              : entries 68, size 6.61 KB, capacity 97 MB, 1158 hits, 1276 requests, 0.908 recent hit rate, 14400 save period in seconds
Row Cache              : entries 0, size 0 bytes, capacity 0 bytes, 0 hits, 0 requests, NaN recent hit rate, 0 save period in seconds
Counter Cache          : entries 0, size 0 bytes, capacity 48 MB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Token                  : (invoke with -T/--tokens to see all 256 tokens)
From system.log, I see lots of "Compacting large partition" warnings:

WARN  [CompactionExecutor:33463] 2016-12-24 05:42:29,550 SSTableWriter.java:240 - Compacting large partition mydb/Table_Name:2016-12-23 00:00+0530 (142735455 bytes)
WARN  [CompactionExecutor:33465] 2016-12-24 05:47:57,343 SSTableWriter.java:240 - Compacting large partition mydb/Table_Name_2:22:0c2e6c00-a5a3-11e6-a05e-1f69f32db21c (162203393 bytes)
Regarding tombstones, I noticed the following in system.log:

[main] 2016-12-28 18:23:06,534 YamlConfigurationLoader.java:135 - Node configuration: [authenticator=PasswordAuthenticator; authorizer=CassandraAuthorizer; auto_snapshot=true; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; cas_contention_timeout_in_ms=1000; client_encryption_options=; cluster_name=bankbazaar; column_index_size_in_kb=64; commit_failure_policy=ignore; commitlog_directory=/var/cassandra/log/commitlog; commitlog_segment_size_in_mb=32; commitlog_sync=periodic; commitlog_sync_period_in_ms=10000; compaction_throughput_mb_per_sec=16; concurrent_counter_writes=32; concurrent_reads=32; concurrent_writes=32; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=15000; cross_node_timeout=false; data_file_directories=[/cryptfs/sdb/cassandra/data, /cryptfs/sdc/cassandra/data, /cryptfs/sdd/cassandra/data]; disk_failure_policy=best_effort; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=600000; dynamic_snitch_update_interval_in_ms=100; endpoint_snitch=GossipingPropertyFileSnitch; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; incremental_backups=false; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; inter_dc_tcp_nodelay=false; internode_compression=all; key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=127.0.0.1; max_hint_window_in_ms=10800000; max_hints_delivery_threads=2; memtable_allocation_type=heap_buffers; native_transport_port=9042; num_tokens=256; partitioner=org.apache.cassandra.dht.Murmur3Partitioner; permissions_validity_in_ms=2000; range_request_timeout_in_ms=20000; read_request_timeout_in_ms=10000; request_scheduler=org.apache.cassandra.scheduler.NoScheduler; request_timeout_in_ms=20000; row_cache_save_period=0; row_cache_size_in_mb=0; rpc_address=127.0.0.1; rpc_keepalive=true; rpc_port=9160; rpc_server_type=sync; saved_caches_directory=/var/cassandra/data/saved_caches; seed_provider=[{class_name=org.apache.cassandra.locator.SimpleSeedProvider, parameters=[{seeds=127.0.0.1}]}]; server_encryption_options=; snapshot_before_compaction=false; ssl_storage_port=9001; sstable_preemptive_open_interval_in_mb=50; start_native_transport=true; start_rpc=true; storage_port=9000; thrift_framed_transport_size_in_mb=15; tombstone_failure_threshold=100000; tombstone_warn_threshold=1000; trickle_fsync=false; trickle_fsync_interval_in_kb=10240; truncate_request_timeout_in_ms=60000; write_request_timeout_in_ms=5000]

nodetool tpstats

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CounterMutationStage              0         0              0         0                 0
ReadStage                        32      4061       50469243         0                 0
RequestResponseStage              0         0              0         0                 0
MutationStage                    32        22       27665114         0                 0
ReadRepairStage                   0         0              0         0                 0
GossipStage                       0         0              0         0                 0
CacheCleanupExecutor              0         0              0         0                 0
AntiEntropyStage                  0         0              0         0                 0
MigrationStage                    0         0              0         0                 0
Sampler                           0         0              0         0                 0
ValidationExecutor                0         0              0         0                 0
CommitLogArchiver                 0         0              0         0                 0
MiscStage                         0         0              0         0                 0
MemtableFlushWriter               0         0           7769         0                 0
MemtableReclaimMemory             1        57          13433         0                 0
PendingRangeCalculator            0         0              1         0                 0
MemtablePostFlush                 0         0           9279         0                 0
CompactionExecutor                3        47         169022         0                 0
InternalResponseStage             0         0              0         0                 0
HintedHandoff                     0         1            148         0                 0
Is there any YAML or other configuration that can be used to avoid these "large compactions"?
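For context, the compaction- and tombstone-related knobs live in cassandra.yaml. A minimal sketch, assuming the Cassandra 2.1 key names (the values shown are illustrative, not tuning advice):

```yaml
# cassandra.yaml -- illustrative values only
compaction_throughput_mb_per_sec: 16   # throttles compaction IO (0 disables throttling)
concurrent_compactors: 2               # caps parallel compactions on small hardware
tombstone_warn_threshold: 1000         # warn when a query scans this many tombstones
tombstone_failure_threshold: 100000    # abort the query beyond this many
```

Note that none of these settings prevent large partitions themselves: partition size is determined by the data model, not by the YAML.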

What is the correct compaction strategy? Can a wrong compaction strategy cause OutOfMemory?

In one of the keyspaces, we write each row once and read it many times.

For the other keyspace, we have time-series data: insert-only, read many times.

Seeing this: Heap Memory (MB): 812.72 / 1945.63 tells me your single machine is probably underpowered. There's a good chance you're not able to keep up with GC.

In this case, I think it's likely related to being undersized, but access patterns, data model, and payload size can also affect GC, so if you want to update the post with that information, I can update my answer to reflect that.

Edit to reflect new information

Thanks for adding more information. Based on what you posted, there are two things I notice that would contribute immediately to your heap blowing up:

Large partitions:

It looks like compaction had to compact two partitions that exceed 100MB (140MB and 160MB respectively). Normally that would still be OK (not great), but because you're running on underpowered hardware with such a small heap, that's quite a lot.
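A common remedy for partitions like mydb/Table_Name:2016-12-23 00:00+0530 (an entire day of time-series data in one partition) is to add a bucket column to the partition key. A hypothetical CQL sketch, with table and column names invented for illustration and types limited to what 2.1 supports:

```sql
-- Hypothetical schema: hourly buckets spread one day's writes
-- across 24 partitions instead of one oversized partition.
CREATE TABLE mydb.events_by_hour (
    day        text,       -- e.g. '2016-12-23'; was the whole partition key before
    hour       int,        -- 0-23 bucket column
    event_time timestamp,
    payload    text,
    PRIMARY KEY ((day, hour), event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);
```

Reads for one day then fan out over the 24 (day, hour) partitions, keeping each individual partition far smaller than the ones triggering the warnings above.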

The thing about compaction:

It uses a healthy mix of resources when it runs. It's business as usual, so it's something you should test for and plan around. In this case, I'm sure compaction is working harder because of the large partitions, eating up CPU (which GC also needs), heap, and IO.

This brings me to another problem:

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CounterMutationStage              0         0              0         0                 0
ReadStage                        32      4061       50469243         0                 0
This is usually a sign that you need to scale up and/or out. In your case, you probably want to do both. You can exhaust a single underpowered node quite quickly with an unoptimized data model. And testing in a single-node environment won't expose you to the nuances of a distributed system.

So the TL;DR:


For a read-heavy workload (which this appears to be), you'll need a bigger heap. For overall sanity and cluster health, you'll want to revisit your data model to make sure your partitioning logic is sound. If you're not sure how or why to do that, I suggest spending some time here:

Could you tell me how to obtain the "access patterns, data model and payload" information you asked for?