Cassandra: number of disk seeks in a read request


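I am trying to understand the maximum number of disk seeks required for a read operation in Cassandra. I looked at several online articles, including this one: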


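As I understand it, two disk seeks are required in the worst case: one to read the partition index, and another to read the actual data from the compressed partition. The position of the data within the compressed partition is obtained from the compression offset table, which is stored in memory. Am I on the right track? Will there ever be a case where more than one disk seek is needed to read the data?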

I am posting the answer I received on the Cassandra user community thread here, in case anyone else needs it:

You're right: one seek with a hit in the partition key cache, and two if not.
That's the theory, but there are two things to mention:

First, you need two seeks per SSTable, not per entire read. So if your data is spread over multiple SSTables on disk, you obviously need more than two seeks. Think of frequently updated partition keys: in combination with memory pressure, you can easily end up with many SSTables (they will be compacted at some point in the future).

Second, there could be fragmentation on disk, which leads to additional seeks during sequential reads.

Note: Each SSTable has its own partition index.
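
The counting rule from the answer can be sketched as a small Python model. This is only a back-of-the-envelope illustration, not Cassandra code: the `SSTable` class and the function names are hypothetical, and the model assumes every listed SSTable has to be consulted for the partition. It ignores disk fragmentation and the OS page cache, either of which can change the real number.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SSTable:
    """Hypothetical stand-in for one SSTable touched by a read."""
    name: str
    key_cache_hit: bool  # True if the partition's position is in the key cache


def seeks_for_sstable(table: SSTable) -> int:
    # With a key cache hit, only the data read needs a seek.
    # Without it, the partition index is read first, then the data.
    return 1 if table.key_cache_hit else 2


def estimate_read_seeks(tables: Iterable[SSTable]) -> int:
    # Seeks are counted per SSTable, not per read, so they add up.
    return sum(seeks_for_sstable(t) for t in tables)


if __name__ == "__main__":
    touched = [
        SSTable("sstable-1", key_cache_hit=True),
        SSTable("sstable-2", key_cache_hit=False),
        SSTable("sstable-3", key_cache_hit=False),
    ]
    print(estimate_read_seeks(touched))  # 1 + 2 + 2 = 5
```

For a partition that lives in a single SSTable, the model gives the one or two seeks described in the question; the total grows only with the number of SSTables that hold data for the partition.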