incubator-cassandra-user mailing list archives

From Nigel Kerr <nigel.k...@gmail.com>
Subject Re: RE 200TB in Cassandra ?
Date Thu, 19 Apr 2012 12:30:32 GMT
Can you say more about how and how often these 200TB get used, queried,
updated?  Is a different usage profile needed?  What kind of column
families do you have in mind for them?
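
The node-count arithmetic quoted further down the thread (200TB of raw data, RF=3, ~600GB usable per node) can be sketched as follows. This is a minimal illustrative helper, not anything Cassandra provides; the 2TB-disk figure in the second call is taken from the thread's suggestion of jamming multiple 2TB disks into each node:

```python
# Rough cluster-size estimate from the figures discussed in this thread.
def nodes_needed(raw_tb, replication_factor, per_node_gb):
    total_gb = raw_tb * 1000 * replication_factor  # decimal TB -> GB
    # Round up: a fractional node still means one more machine.
    return -(-total_gb // per_node_gb)

print(nodes_needed(200, 3, 600))       # -> 1000 nodes (600,000 GB / 600 GB)
print(nodes_needed(200, 3, 6 * 1000))  # -> 100 nodes if each holds ~six 2TB disks
```

The calculation shows why the per-node figure dominates: tripling or halving it scales the machine count directly.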


On Thu, Apr 19, 2012 at 8:24 AM, Franc Carter <franc.carter@sirca.org.au> wrote:

> On Thu, Apr 19, 2012 at 10:16 PM, Yiming Sun <yiming.sun@gmail.com> wrote:
>
>> 600 TB is really a lot; even 200 TB is a lot.  In our organization,
>> storage at that scale is handled by our storage team, who purchase
>> specialized (and very expensive) equipment from storage hardware vendors,
>> because at this scale performance and reliability are absolutely critical.
>
>
> Yep, that's what we currently do. We have 200TB sitting on a set of
> high-end disk arrays running RAID6. I'm in the early stages of looking at
> whether this is still the best approach.
>
>
>>
>> but it sounds like your team may not be able to afford such equipment.
>>  At 600GB per node you would need a cloud of machines, and a data center
>> to house them... but 2TB disks are commonplace nowadays, and you can put
>> multiple 2TB disks in each node to reduce the number of machines needed.
>> It all depends on what budget you have.
>>
>
> The bit I am trying to understand is whether my figure of 400GB/node in
> practice for Cassandra is correct, or whether we can push the GB/node
> higher, and if so, how high.
>
> cheers
>
>
>> -- Y.
>>
>>
>> On Thu, Apr 19, 2012 at 7:54 AM, Franc Carter <franc.carter@sirca.org.au> wrote:
>>
>>> On Thu, Apr 19, 2012 at 9:38 PM, Romain HARDOUIN <romain.hardouin@urssaf.fr> wrote:
>>>
>>>>
>>>> Cassandra supports data compression and, depending on your data, you
>>>> can see up to a 4x reduction in data size.
>>>>
>>>
>>> The data is gzip'd already ;-)
>>>
>>>
>>>> 600 TB is a lot, and hence requires a lot of servers...
>>>>
>>>>
>>>> Franc Carter <franc.carter@sirca.org.au> wrote on 19/04/2012 13:12:19:
>>>>
>>>> > Hi,
>>>> >
>>>> > One of the projects I am working on is going to need to store about
>>>> > 200TB of data, generally in manageable binary chunks. However, after
>>>> > doing some rough calculations based on rules of thumb I have seen for
>>>> > how much storage should be on each node, I'm worried.
>>>> >
>>>> >   200TB with RF=3 is 600TB = 600,000GB
>>>> >   Which is 1000 nodes at 600GB per node
>>>> >
>>>> > I'm hoping I've missed something as 1000 nodes is not viable for us.
>>>> >
>>>> > cheers
>>>> >
>>>> > --
>>>> > Franc Carter | Systems architect | Sirca Ltd
>>>> > franc.carter@sirca.org.au | www.sirca.org.au
>>>> > Tel: +61 2 9236 9118
>>>> > Level 9, 80 Clarence St, Sydney NSW 2000
>>>> > PO Box H58, Australia Square, Sydney NSW 1215
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
>
