incubator-cassandra-user mailing list archives

From Franc Carter <franc.car...@sirca.org.au>
Subject Re: RE 200TB in Cassandra ?
Date Thu, 19 Apr 2012 12:24:01 GMT
On Thu, Apr 19, 2012 at 10:16 PM, Yiming Sun <yiming.sun@gmail.com> wrote:

> 600 TB is really a lot; even 200 TB is a lot.  In our organization,
> storage at this scale is handled by our storage team, and they purchase
> specialized (and very expensive) equipment from storage hardware vendors,
> because at this scale performance and reliability are absolutely critical.


Yep, that's what we currently do. We have 200TB sitting on a set of high-end
disk arrays running RAID6. I'm in the early stages of looking at whether
this is still the best approach.


>
> but it sounds like your team may not be able to afford such equipment.
>  600GB per node will require a cloud, and you need a data center to house
> them... but 2TB disks are commonplace nowadays, and you can jam multiple
> 2TB disks into each node to reduce the number of machines needed.  It all
> depends on what budget you have.
>

The bit I am trying to understand is whether my figure of 400GB/node in
practice for Cassandra is correct, or whether we can push the GB/node
higher, and if so, how high.
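As a rough sanity check on the arithmetic in this thread, here is a small sketch of how the node count falls as per-node density rises (the 600 GB/node rule of thumb and the alternative densities are illustrative assumptions, not Cassandra limits):

```python
# Capacity-planning sketch for the numbers discussed in this thread.
# Assumptions: 200 TB raw data, replication factor 3, and a few
# illustrative per-node data densities (the 600 GB/node rule of thumb
# versus denser nodes).

RAW_TB = 200
RF = 3
TOTAL_GB = RAW_TB * RF * 1000  # 600 TB = 600,000 GB

def nodes_needed(per_node_gb):
    """Nodes required to hold TOTAL_GB at per_node_gb per node (rounded up)."""
    return -(-TOTAL_GB // per_node_gb)  # ceiling division

for density in (600, 1000, 2000, 4000):
    print(f"{density:>5} GB/node -> {nodes_needed(density):>4} nodes")
# 600 GB/node reproduces the 1000-node figure from the original post;
# pushing density to 2 TB/node brings it down to 300 nodes.
```

Whether the denser configurations are workable in practice (repair and compaction times, disk I/O) is exactly the open question here.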

cheers


> -- Y.
>
>
> On Thu, Apr 19, 2012 at 7:54 AM, Franc Carter <franc.carter@sirca.org.au>wrote:
>
>> On Thu, Apr 19, 2012 at 9:38 PM, Romain HARDOUIN <
>> romain.hardouin@urssaf.fr> wrote:
>>
>>>
>>> Cassandra supports data compression and depending on your data, you can
>>> gain a reduction in data size up to 4x.
>>>
>>
>> The data is gzip'd already ;-)
>>
>>
>>> 600 TB is a lot, hence requires lots of servers...
>>>
>>>
>>> Franc Carter <franc.carter@sirca.org.au> wrote on 19/04/2012 at
>>> 13:12:19:
>>>
>>> > Hi,
>>> >
>>> > One of the projects I am working on is going to need to store about
>>> > 200TB of data - generally in manageable binary chunks. However,
>>> > after doing some rough calculations based on rules of thumb I have
>>> > seen for how much storage should be on each node I'm worried.
>>> >
>>> >   200TB with RF=3 is 600TB = 600,000GB
>>> >   Which is 1000 nodes at 600GB per node
>>> >
>>> > I'm hoping I've missed something as 1000 nodes is not viable for us.
>>> >
>>> > cheers
>>> >
>>> > --
>>> > Franc Carter | Systems architect | Sirca Ltd
>>> > franc.carter@sirca.org.au | www.sirca.org.au
>>> > Tel: +61 2 9236 9118
>>> > Level 9, 80 Clarence St, Sydney NSW 2000
>>> > PO Box H58, Australia Square, Sydney NSW 1215


-- 
*Franc Carter* | Systems architect | Sirca Ltd
franc.carter@sirca.org.au | www.sirca.org.au
Tel: +61 2 9236 9118
Level 9, 80 Clarence St, Sydney NSW 2000
PO Box H58, Australia Square, Sydney NSW 1215
