incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: cassandra vs. mongodb quick question
Date Mon, 18 Feb 2013 20:39:49 GMT
In my experience, repair of 300GB of compressed data takes longer than 300GB of uncompressed data, but I cannot point to an exact number. Calculating the differences is mostly CPU bound and works on the uncompressed data.

Streaming uses compression (after uncompressing the on-disk data).

So if you have 300GB of compressed data, take a look at how long repair takes and see if you
are comfortable with that. You may also want to test replacing a node so you can get the procedure
documented and understand how long it takes.  
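If you want to put numbers on that, here is a minimal sketch (assuming `nodetool` is on the PATH and can reach the node; `my_keyspace` is a placeholder for your own keyspace name):

```shell
# A sketch, not a recipe: time a primary-range repair on one node to get a
# per-node baseline:
#
#   time nodetool repair -pr my_keyspace
#
# and watch streaming progress from another shell with:
#
#   nodetool netstats
#
# With a measured wall-clock time you can extrapolate. For example, if
# 300 GB repairs in 10 hours, the effective rate is:
data_gb=300
hours=10
echo "repair throughput: $((data_gb / hours)) GB/hour"
# → repair throughput: 30 GB/hour
```

Dividing your per-node data size by that rate gives a rough idea of how repair time will grow as the node fills up.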

The idea of the soft 300GB to 500GB limit came about because of a number of cases where people had 1 TB on a single node and were surprised it took days to repair or replace. If you know how long things may take, and that fits your operations, then go with it.

Aaron Morton
Freelance Cassandra Developer
New Zealand


On 18/02/2013, at 10:08 PM, Vegard Berget <> wrote:

> Just out of curiosity :
> When using compression, does this affect things one way or another? Is 300G the (compressed) SSTable size, or the total size of the data?
> .vegard,
> ----- Original Message -----
> Sent: Mon, 18 Feb 2013 08:41:25 +1300
> Subject: Re: cassandra vs. mongodb quick question
> If you have spinning disk and 1G networking and no virtual nodes, I would still say 300G to 500G is a soft limit.
> If you are using virtual nodes, SSD, a JBOD disk configuration or faster networking you may go higher.
> The limiting factors are the time it takes to repair, the time it takes to replace a node, and the memory considerations for hundreds of millions of rows. If the performance of those operations is acceptable to you, then go crazy.
> Cheers
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> @aaronmorton
> On 16/02/2013, at 9:05 AM, "Hiller, Dean" <> wrote:
> So I found out mongodb varies their node size from 1T to 42T per node depending on the profile. So if I was going to be writing a lot but rarely changing rows, could I also use cassandra with a per-node size of +20T, or is that not advisable?
> Thanks,
> Dean
