incubator-cassandra-user mailing list archives

From Bill
Subject Re: commodity server spec
Date Sat, 03 Sep 2011 23:34:48 GMT
[100% agree with Chris]

China, the machines you're describing sound nice for 
mongodb/postgres/mysql, but are probably not the sweet spot for Cassandra.

Obviously (well, depending on near-term load) you don't want to get 
burned on excess footprint. But a realistic, don't-lose-data, 
be-fairly-available deployment is going to span at least 2 racks/power 
supplies and have data replicated offsite (at least as a passive copy 
for DR). So I would consider 6-9 relatively weaker servers rather than 
3 scale-up joints. You'll save some capex, and the extra opex overhead 
is probably worth it when traded off against the operational risk. 3 is 
an awkward number to operate for anything that needs to be available 
(although many people seem to start with that, I am guessing because 
triplication is traditionally understood under failure), as it 
immediately puts 50% extra load on the remaining 2 when one node goes 
away. One will go away, even transiently, when it is upgraded, crashes, 
or gets into a funk due to compaction or garbage collection, and load 
will then be shunted onto the other 2. Remember, Cassandra has no 
backoff/throttling in place. I'd allow for something breaking at some 
point (dbs, even the mature ones, fail from time to time), and 2 doesn't 
give you much room to maneuver in production.
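
To put a number on the "50% extra load" point, here is a rough sketch in 
Python. It assumes load is spread evenly across nodes and that client 
traffic stays constant when a node drops out; real clusters will differ 
depending on token ranges and replica placement. The function name 
extra_load_fraction is made up for illustration.

```python
def extra_load_fraction(nodes: int, failed: int = 1) -> float:
    """Fractional extra load on each surviving node after `failed` nodes drop out.

    Assumes (for illustration only) that total load is constant and is
    spread evenly over whichever nodes are up.
    """
    if not 0 <= failed < nodes:
        raise ValueError("failed must be >= 0 and smaller than nodes")
    # Each node carries 1/nodes of the load before the failure and
    # 1/(nodes - failed) afterwards; the ratio minus 1 is the increase.
    return nodes / (nodes - failed) - 1.0


if __name__ == "__main__":
    for n in (3, 6, 9):
        print(f"{n} nodes: +{extra_load_fraction(n):.1%} load per survivor")
```

Under those assumptions, losing 1 of 3 nodes adds 50% to each survivor, 
while losing 1 of 6 adds 20% and losing 1 of 9 adds 12.5%.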


On 03/09/11 23:05, Chris Goffinet wrote:
> It will also depend on how long you can handle recovery time. So imagine
> this case:
> 3 nodes w/ RF of 3
> Each node has 30TB of space used (you never want to fill up an entire
> node). If one node fails and you must recover, that will take over 3.6
> days in just transferring data alone. That's with a sustained
> 800 megabit/s (100MB/s). In the real world it's going to fluctuate, so
> add some
> padding. Also, since you will be saturating one of the other nodes, now
> your network latency is suffering and you only have 1
> machine to handle the remaining traffic while you're recovering. And if
> you want to expand the cluster in the future (more nodes), the amount of
> data to transfer is going to be very large and will most likely take
> days per added machine. From my experience it's much better to have a
> larger cluster set up upfront for future growth than getting by with
> 6-12 nodes at the start. You will feel less pain, and it is easier to
> manage node failures (bad disks, mem, etc).
> 3 nodes with RF of 1 wouldn't make sense.
> On Sat, Sep 3, 2011 at 4:05 AM, China Stoffen wrote:
>     Many small servers would drive up the hosting cost way too high so
>     want to avoid this solution if we can.
>     ----- Original Message -----
>     From: Radim Kolar
>     Sent: Saturday, September 3, 2011 9:37 AM
>     Subject: Re: commodity server spec
>     many smaller servers way better
