Python cql and Cassandra 1.2: rpc_timeout on reads


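I have a Python application that uses a Cassandra 1.2 cluster. The cluster has 7 physical nodes using virtual nodes; one keyspace has a replication factor of 3 and another has a replication factor of 1. The application connects to Cassandra with the cql library and runs queries. The problem is that when I run SELECTs against the database, I have started getting the following error: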

Request did not complete within rpc_timeout
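When I check the state of the cluster, I can see that one of my nodes has CPU usage above 100%, and when I check Cassandra's system.log I can see this appearing over and over: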

 INFO [ScheduledTasks:1] 2013-06-07 02:02:01,640 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:02,642 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 630 ms for 1 collections, 948849672 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:02,643 GCInspector.java (line 142) Heap is 0.9900367202591844 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2013-06-07 02:02:02,685 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:04,224 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 1222 ms for 2 collections, 931216176 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:04,224 GCInspector.java (line 142) Heap is 0.9716378009554072 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2013-06-07 02:02:04,225 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:05,226 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 709 ms for 1 collections, 942735576 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:05,227 GCInspector.java (line 142) Heap is 0.9836572275641711 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
 INFO [ScheduledTasks:1] 2013-06-07 02:02:05,229 StorageService.java (line 3565) Unable to reduce heap usage since there are no dirty column families
 INFO [ScheduledTasks:1] 2013-06-07 02:02:06,946 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 1271 ms for 2 collections, 939532792 used; max is 958398464
 WARN [ScheduledTasks:1] 2013-06-07 02:02:06,946 GCInspector.java (line 142) Heap is 0.980315419203343 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
Is there any way to fix this?

Thanks in advance.

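Which partitioner are you using? What does your data schema look like? How many records do you have, and how many records should your query return? We need to know these things to find the right answer to your problem.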

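With Cassandra, data model design matters a great deal. Unlike an RDBMS, Cassandra does not let you casually create an index on every column. A column family has to be defined so that its data is spread evenly across the cluster nodes; otherwise you get hot spots, or reads that are all served by a single node. I think that may be the cause of your rpc timeouts.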

If you need more help, please post more details. Thanks.

I hope this helps.

It looks like the Cassandra JVM heap may be too small, at only about 1 GB:

max is 958398464
I would suggest increasing the heap to at least 2 GB, assuming the node has memory available.
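To check how full the heap is without reading the numbers by eye, the GCInspector lines from the question can be parsed directly. This is only a minimal sketch: the regular expression and the helper name heap_usage are my own and assume the "N used; max is M" wording shown in the log above.

```python
import re

# Matches the "<used> used; max is <max>" part of a GCInspector log line.
# The pattern is an assumption based on the log lines quoted in the question.
GC_LINE = re.compile(r"(\d+) used; max is (\d+)")


def heap_usage(line):
    """Return (used_bytes, max_bytes, fraction_used), or None if no match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    used = int(match.group(1))
    max_bytes = int(match.group(2))
    return used, max_bytes, used / max_bytes


sample = (
    "INFO [ScheduledTasks:1] 2013-06-07 02:02:02,642 GCInspector.java (line 119) "
    "GC for ConcurrentMarkSweep: 630 ms for 1 collections, "
    "948849672 used; max is 958398464"
)

used, max_bytes, fraction = heap_usage(sample)
# 958398464 bytes is 914 MiB, so the heap is a little under 1 GB.
print(f"max heap: {max_bytes / 2**20:.0f} MiB, used: {fraction:.1%}")
# prints "max heap: 914 MiB, used: 99.0%"
```

The 99% figure agrees with the "Heap is 0.99... full" warning in the log, so the node is spending its time in back-to-back CMS collections rather than serving reads.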

See cassandra-env.sh for how the JVM heap allocation is calculated, or set it manually to a specific value.
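As a rough illustration of that calculation: as far as I recall, the default in cassandra-env.sh is max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8 GB)). Treat the formula as an assumption and check it against the script shipped with your version; the function name default_max_heap_mb below is hypothetical.

```python
def default_max_heap_mb(system_memory_mb):
    """Approximate the default max heap picked by cassandra-env.sh.

    Assumed formula (verify against your own cassandra-env.sh):
        max(min(1/2 RAM, 1024 MB), min(1/4 RAM, 8192 MB))
    """
    half = system_memory_mb // 2
    quarter = system_memory_mb // 4
    return max(min(half, 1024), min(quarter, 8192))


# Heap the script would choose for a few machine sizes, in MB.
for ram_mb in (1024, 2048, 16384, 65536):
    print(ram_mb, "->", default_max_heap_mb(ram_mb))
```

Under this formula any node with between 2 GB and 4 GB of RAM ends up with a 1 GB heap. To override it, cassandra-env.sh has MAX_HEAP_SIZE and HEAP_NEWSIZE variables, which should be set together.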