cassandra-user mailing list archives

From Janne Jalkanen
Subject Re: Best strategy for adding new nodes to the cluster
Date Tue, 28 Sep 2010 09:19:16 GMT

On 28 Sep 2010, at 08:37, Michael Dürgner wrote:

>> What do you mean by "running live"? I am also planning to use cassandra on EC2 using
>> small nodes. Small nodes have 1/4 cpu of the large ones, 1/4 cost, but I/O is more than 1/4
>> (amazon does not give explicit I/O numbers...), so I think 4 small instances should perform
>> better than 1 large one (and the cost is the same), am I wrong?
> Based on results we saw and what you also find in different sources around the web, EC2
> small instances perform worse than 1/4 regarding IO performance.

Ditto. My tests indicate that while the peak IO performance of small nodes can be OK (up to
1/2 of a large node's), it degrades over time to 1/6 or even less. It seems that Amazon dedicates
sufficient bandwidth to small nodes in the beginning to ensure a smooth and quick boot, but
then throttles down fairly aggressively within a few minutes. This seems to affect reads
more than writes, though.

Note also that large instances have over 4x the memory (1.7 GB => 7.5 GB), and that makes
a world of difference (you can have larger caches, for example). You don't really want to
start swapping on the small instances.
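
As a rough back-of-the-envelope check on the "4 small vs 1 large" question, the ratios
quoted above can be plugged into a few lines of Python. This is only a sketch: it assumes
IO adds up linearly across nodes (an idealisation that ignores replication and network
overhead), and the constant and function names are made up for illustration.

```python
from fractions import Fraction

# Figures quoted in this thread, relative to one large instance (= 1).
SMALL_PEAK_IO = Fraction(1, 2)       # "up to 1/2 of large"
SMALL_SUSTAINED_IO = Fraction(1, 6)  # "down to 1/6 or even less"
SMALL_MEM_GB = 1.7
LARGE_MEM_GB = 7.5


def aggregate_io(nodes, per_node_io):
    """Assumption: IO capacity sums linearly across nodes."""
    return nodes * per_node_io


peak = aggregate_io(4, SMALL_PEAK_IO)
sustained = aggregate_io(4, SMALL_SUSTAINED_IO)
mem_ratio = LARGE_MEM_GB / SMALL_MEM_GB

print(f"4 small, peak IO:      {float(peak):.2f} x large")
print(f"4 small, sustained IO: {float(sustained):.2f} x large")
print(f"memory per node, large/small: {mem_ratio:.1f}x")
```

Under those assumptions, four small nodes look fine at peak but end up at roughly two
thirds of one large node's IO once throttled, and each node still has less than a quarter
of the memory for caches.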

(However, small instances are awesome for doing testing and learning how to manage a cluster.)
