Our cluster runs with up to 2TB/node (that's the compressed size) and RF=2. The figure of 400GB/node is by no means a maximum or limit; it is generally the point where Cassandra mostly ‘just works’ without too much manual intervention for most workloads. You absolutely can run with more than 400GB/node; it's just important to realize some of the performance and operational implications:
- Adding nodes (specifically, changing ring topology) becomes a heavier and longer process (we don't really add nodes that often, so not a huge issue for us).
- Repair becomes more expensive (we have an append-only workload and don't need to run repair).
- As your ratio of data on disk to available RAM increases, read performance can become extremely volatile and inconsistent (we use Cassandra as a data source for internal analytics, so inconsistent read performance is acceptable).
- Beware of the bloom filter JVM memory requirements if you have a large number of rows per node (billions).
- As a somewhat hand-wavy final point: I haven’t found Cassandra’s compaction strategies (certainly not leveled compaction) all that well suited or easy to tune for such large data sets. We make extensive use of expiring columns, and about once a month I typically have to take down nodes and manually remove sstables I know have expired (I'm excited for things like CASSANDRA-3974).
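To put a rough number on the bloom filter point above, here is a minimal sketch using the textbook sizing formula for an optimally configured Bloom filter. This is not Cassandra's actual implementation, and the false-positive rate and row counts are assumptions chosen for illustration only; check your version's settings before relying on the result.

```python
import math


def bloom_filter_bytes(num_keys, false_positive_rate):
    """Approximate size of an optimally sized Bloom filter, in bytes.

    Uses the standard formula bits = -n * ln(p) / (ln 2)^2.
    Cassandra's real filters may be sized differently (assumption).
    """
    bits_per_key = -math.log(false_positive_rate) / (math.log(2) ** 2)
    return math.ceil(num_keys * bits_per_key / 8)


if __name__ == "__main__":
    # Hypothetical row counts per node; 1% false-positive rate is assumed.
    for rows in (100_000_000, 1_000_000_000, 5_000_000_000):
        size_gib = bloom_filter_bytes(rows, 0.01) / 2**30
        print(f"{rows:>13,} rows -> about {size_gib:.2f} GiB of filter")
```

At roughly ten bits per row, billions of rows per node translate into gigabytes of memory, which is why this is worth checking before pushing node density.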
My main point is that you can push Cassandra way beyond 400GB/node (depending on your workload), but I find it a bit more finicky to deal with. As with most things, you should probably just try it out in a smaller-scale prototype (a couple of nodes).
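The sizing arithmetic discussed in this thread can be sketched as follows. The function name and the list of per-node densities are made up for illustration; the 200TB, RF=3, and 1TB = 1000GB figures come from the original question further down, and the 400GB and 2TB densities from this reply. It ignores headroom for compaction and any gain from compression (the data in question is already gzip'd).

```python
import math


def nodes_needed(raw_tb, replication_factor, per_node_gb):
    """Node count required to hold raw_tb of data at a given density.

    Uses 1 TB = 1000 GB, as in the original question.
    """
    total_gb = raw_tb * 1000 * replication_factor
    return math.ceil(total_gb / per_node_gb)


if __name__ == "__main__":
    # Densities mentioned in the thread: 400GB, 600GB, and 2TB per node.
    for per_node_gb in (400, 600, 2000):
        count = nodes_needed(200, 3, per_node_gb)
        print(f"{per_node_gb}GB/node -> {count} nodes")
```

The node count falls in direct proportion to the density you are willing to operate at, which is why the trade-offs listed above matter more than any fixed GB/node rule.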
On Thu, Apr 19, 2012 at 10:16 PM, Yiming Sun <email@example.com> wrote:
600 TB is really a lot, even 200 TB is a lot. In our organization, storage at such scale is handled by our storage team and they purchase specialized (and very expensive) equipment from storage hardware vendors because at this scale, performance and reliability are absolutely critical.
Yep, that's what we currently do. We have 200TB sitting on a set of high-end disk arrays running RAID6. I'm in the early stages of looking at whether this is still the best approach.
but it sounds like your team may not be able to afford such equipment. 600GB per node will require a cloud and you need a data center to house them... but 2TB disks are commonplace nowadays and you can jam multiple 2TB disks into each node to reduce the number of machines needed. It all depends on what budget you have.
The bit I am trying to understand is whether my figure of 400GB/node in practice for Cassandra is correct, or whether we can push the GB/node higher and, if so, how high.
On Thu, Apr 19, 2012 at 7:54 AM, Franc Carter <firstname.lastname@example.org> wrote:
On Thu, Apr 19, 2012 at 9:38 PM, Romain HARDOUIN <email@example.com> wrote:
Cassandra supports data compression and, depending on your data, you can reduce the data size by up to 4x.
The data is gzip'd already ;-)
600 TB is a lot, hence requires lots of servers...
Franc Carter <firstname.lastname@example.org> wrote on 19/04/2012 13:12:19:
> One of the projects I am working on is going to need to store about
> 200TB of data - generally in manageable binary chunks. However,
> after doing some rough calculations based on rules of thumb I have
> seen for how much storage should be on each node I'm worried.
> 200TB with RF=3 is 600TB = 600,000GB
> Which is 1000 nodes at 600GB per node
> I'm hoping I've missed something as 1000 nodes is not viable for us.
> Franc Carter | Systems architect | Sirca Ltd
> email@example.com | www.sirca.org.au
> Tel: +61 2 9236 9118
> Level 9, 80 Clarence St, Sydney NSW 2000
> PO Box H58, Australia Square, Sydney NSW 1215