cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: 200TB in Cassandra ?
Date Thu, 19 Apr 2012 20:27:25 GMT
Couple of ideas:

* take a look at compression in 1.X http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression
* is there repetition in the binary data? Can you save space by implementing content-addressable storage? (rough sketch below)
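To make the content-addressable idea concrete, here is a minimal sketch (Python, purely illustrative; the dict and function names are stand-ins for whatever column family and client calls you actually use): key each chunk by a digest of its bytes, so identical chunks map to the same row and are only stored once.

    import hashlib

    def chunk_key(data):
        # Content-addressable key: identical chunks always produce the
        # same key, so duplicates collapse to a single stored copy.
        return hashlib.sha256(data).hexdigest()

    def store_chunk(store, data):
        # 'store' stands in for a column family / client session; a dict
        # is used here only to keep the sketch self-contained.
        key = chunk_key(data)
        if key not in store:   # skip the write if the chunk already exists
            store[key] = data
        return key             # callers reference the chunk by its digest

    store = {}
    k1 = store_chunk(store, b"some binary chunk")
    k2 = store_chunk(store, b"some binary chunk")  # duplicate: no extra storage
    assert k1 == k2 and len(store) == 1

Whether that helps depends entirely on how much repetition is in the data, but combined with SSTable compression it can pull the raw 600TB figure down before you size the cluster.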
 
Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/04/2012, at 12:55 AM, Dave Brosius wrote:

> I think your math is 'relatively' correct. If that node count is prohibitive, it would seem to me you should focus on reducing the amount of storage you use per item, if at all possible.
> 
> On 04/19/2012 07:12 AM, Franc Carter wrote:
>> 
>> 
>> Hi,
>> 
>> One of the projects I am working on is going to need to store about 200TB of data - generally in manageable binary chunks. However, after doing some rough calculations based on rules of thumb I have seen for how much storage should be on each node, I'm worried.
>> 
>>   200TB with RF=3 is 600TB = 600,000GB
>>   Which is 1000 nodes at 600GB per node
>> 
>> I'm hoping I've missed something, as 1000 nodes is not viable for us.
>> 
>> cheers
>> 
>> -- 
>> Franc Carter | Systems architect | Sirca Ltd
>> franc.carter@sirca.org.au | www.sirca.org.au
>> Tel: +61 2 9236 9118 
>> Level 9, 80 Clarence St, Sydney NSW 2000
>> PO Box H58, Australia Square, Sydney NSW 1215
>> 
> 

