From: Romain HARDOUIN
To: user@cassandra.apache.org
Subject: RE: 200TB in Cassandra?
Date: Thu, 19 Apr 2012 13:38:49 +0200

Cassandra supports data compression and, depending on your data, you can achieve up to a 4x reduction in data size.
600 TB is a lot, hence requires lots of servers...

Franc Carter <franc.carter@sirca.org.au> wrote on 19/04/2012 13:12:19:

> Hi,
>
> One of the projects I am working on is going to need to store about
> 200 TB of data, generally in manageable binary chunks. However,
> after doing some rough calculations based on rules of thumb I have
> seen for how much storage should be on each node, I'm worried.
>
> 200 TB with RF=3 is 600 TB = 600,000 GB,
> which is 1,000 nodes at 600 GB per node.
>
> I'm hoping I've missed something, as 1,000 nodes is not viable for us.
>
> cheers
>
> --
> Franc Carter | Systems architect | Sirca Ltd
> franc.carter@sirca.org.au | www.sirca.org.au
> Tel: +61 2 9236 9118
> Level 9, 80 Clarence St, Sydney NSW 2000
> PO Box H58, Australia Square, Sydney NSW 1215
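The arithmetic in the quoted message, plus the compression point from the reply, can be sketched as a small back-of-envelope calculator. This is only an illustration of the thread's numbers; the 600 GB/node rule of thumb and the 4x compression ratio are taken from the thread, and the function name and parameters are mine, not any Cassandra API.

```python
import math

def nodes_needed(raw_gb, rf=3, gb_per_node=600, compression=1.0):
    """Rough node-count estimate for a Cassandra cluster.

    raw_gb:       uncompressed data size in GB
    rf:           replication factor (3 in the thread)
    gb_per_node:  usable storage per node (600 GB rule of thumb)
    compression:  on-disk size reduction factor (e.g. 4 for "up to 4x")
    """
    total_gb = raw_gb * rf / compression
    return math.ceil(total_gb / gb_per_node)

# The thread's figures: 200 TB raw, RF=3, 600 GB per node
print(nodes_needed(200_000))                 # 1000 nodes
# Assuming the 4x compression mentioned in the reply
print(nodes_needed(200_000, compression=4))  # 250 nodes
```

Even in the best case, compression only divides the node count by the ratio achieved; the dominant factor remains the per-node storage assumption.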