incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: disks and data files
Date Mon, 13 Dec 2010 11:52:24 GMT
On Mon, Dec 13, 2010 at 12:29 PM, shimi <shimi.k@gmail.com> wrote:
> I am reading the kafka design documentation
> (http://sna-projects.com/kafka/design.php) and I came across this (under
> constant time suffices) :
> Intuitively a persistent queue could be built on simple reads and appends to
> files as is commonly the case with logging solutions. Though this structure
> would not support the rich semantics of a BTree implementation, but it has
> the advantage that all operations are O(1) and reads do not block writes or
> each other. This has obvious performance advantages since the performance is
> completely decoupled from the data size--one server can now take full
> advantage of a number of cheap, low-rotational speed 1+TB SATA drives.
> Though they have poor seek performance, these drives often have comparable
> performance for large reads and writes at 1/3 the price and 3x the capacity.
> Is it right to say that Cassandra takes advantage of this? The commit log
> write uses append, and sstables are only read after they have been written.

Most of it applies to Cassandra, yes. However, the read pattern they are
talking about is that of a queue, which is fairly specific: it barely needs
seeks. (I haven't read the linked article, so note that I'm only referring to
what you have copy-pasted.) This is not true of most read patterns, so in
practice you may not want to push the "let's use disks with poor seek
performance" part too far. As always, it depends on your needs and access
pattern.
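
To illustrate the queue access pattern discussed above, here is a minimal
Python sketch of an append-only log whose readers scan forward from a byte
offset. It is not Kafka's or Cassandra's actual on-disk format: the class
name AppendOnlyLog, the 4-byte length prefix, and the helper demo() are all
assumptions made for this example. Writes only ever go to the end of the
file, and a consumer seeks once to its starting offset and then reads
sequentially.

```python
import os
import struct
import tempfile

# Hypothetical record layout: 4-byte big-endian length, then the payload.
HEADER = struct.Struct(">I")


class AppendOnlyLog:
    """Illustrative append-only log; not a real Kafka or Cassandra format."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist

    def append(self, payload: bytes) -> int:
        """Append one record and return the byte offset where it starts."""
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)
            f.write(HEADER.pack(len(payload)) + payload)
        return offset

    def read_from(self, offset: int):
        """Yield (next_offset, payload) pairs, scanning forward from offset."""
        with open(self.path, "rb") as f:
            f.seek(offset)  # the only seek; everything after is sequential
            while True:
                header = f.read(HEADER.size)
                if len(header) < HEADER.size:
                    return  # end of log
                (length,) = HEADER.unpack(header)
                payload = f.read(length)
                if len(payload) < length:
                    return  # truncated trailing record
                yield f.tell(), payload


def demo():
    """Append three records, then read the whole log and a tail of it."""
    with tempfile.TemporaryDirectory() as d:
        log = AppendOnlyLog(os.path.join(d, "queue.log"))
        offsets = [log.append(m) for m in (b"a", b"bb", b"ccc")]
        all_records = [p for _, p in log.read_from(0)]
        tail = [p for _, p in log.read_from(offsets[1])]
        return offsets, all_records, tail
```

A keyed lookup (fetch an arbitrary row by key) does not fit this pattern: it
generally needs at least one seek per file consulted, which is why disks with
poor seek performance suit a queue better than a general-purpose read
workload.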

--
Sylvain
