cassandra-user mailing list archives

From aaron morton <>
Subject Re: Practical node size limits
Date Sun, 09 Sep 2012 22:41:24 GMT
> The bottleneck now seems to be the repair time. If any node becomes too inconsistent,
or needs to be replaced, the rebuild time is over a week.

This is why I've recommended 300GB to 400GB per node in the past. It's not a hard limit, but
it seems to be a nice balance. You need to take into consideration compaction, repair, backup
/ restore, node replacement, upgrading, and disaster recovery.
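The week-long rebuild time mentioned above is easy to sanity-check with back-of-envelope arithmetic: a replacement node has to restream its entire data set, so rebuild time scales roughly linearly with data per node. A minimal sketch (the streaming throughput figure is an assumption for illustration, not a measured number):

```python
def rebuild_hours(data_gb, stream_mb_per_s):
    """Rough time to restream a node's data set, ignoring
    Merkle tree calculation and compaction overhead."""
    return data_gb * 1024 / stream_mb_per_s / 3600

# At an assumed ~20 MB/s effective streaming throughput:
print(rebuild_hours(400, 20))   # a ~400GB node: well under a day
print(rebuild_hours(8000, 20))  # a multi-TB node: several days
```

Real-world rebuilds are slower still, since repair also has to compute Merkle trees and the node compacts as it receives data, which is part of why a few hundred GB per node is a comfortable ceiling.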

That said, compression, SSDs, and faster networking may mean you can run more data per node.
Also, Virtual Nodes, coming in 1.2.x, will increase the parallelism of repairing / bootstrapping
a node. (It won't help reduce the time taken to calculate Merkle trees, though.)
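Virtual Nodes are enabled per node via num_tokens in cassandra.yaml. A sketch of the 1.2-era setting (the token count shown is the commonly cited default, given here as an illustration):

```yaml
# cassandra.yaml -- with Virtual Nodes each node owns many small token
# ranges, so a bootstrapping or repairing node can stream from many
# peers in parallel instead of a handful of neighbours.
num_tokens: 256
```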


Aaron Morton
Freelance Developer

On 7/09/2012, at 7:28 AM, Dustin Wenz <> wrote:

> This is actually another problem that we've encountered with Cassandra: the range of
platforms it can be deployed on is fairly limited. If you want to run with Oracle's JRE (which
is apparently recommended), you are pretty much stuck with Linux on x86/64 (I haven't tried
the new JDK on ARM yet, but it sounds promising). You could probably do OK on Solaris, too,
with a custom Snappy jar and some JNA concessions.
> 	- .Dustin
> On Sep 5, 2012, at 10:36 PM, Rob Coli <> wrote:
>> On Sun, Jul 29, 2012 at 7:40 PM, Dustin Wenz <> wrote:
>>> We've just set up a new 7-node cluster with Cassandra 1.1.2 running under OpenJDK6.
>> It's worth noting that the Cassandra project recommends the Sun JRE. Without
>> the Sun JRE, you might not be able to use JAMM to determine the live
>> ratio. Very few people use OpenJDK in production, so using it also
>> increases the likelihood that you might be the first to encounter a
>> given issue. FWIW!
>> =Rob
>> -- 
>> =Robert Coli
>> YAHOO - rcoli.palominob
>> SKYPE - rcoli_palominodb
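For reference, the JAMM live-ratio support Rob mentions is wired in as a JVM agent. In a 1.1-era install it looks roughly like the following line in cassandra-env.sh (the jar version is an assumption for illustration):

```shell
# conf/cassandra-env.sh -- load JAMM so Cassandra can measure live
# memtable sizes; this agent-based measurement is what may not work
# reliably outside the Sun/Oracle JRE.
JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
```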
