cassandra-user mailing list archives

From Ben Slater <ben.sla...@instaclustr.com>
Subject Re: Current data density limits with Open Source Cassandra
Date Wed, 08 Feb 2017 22:14:55 GMT
The major issue we’ve seen with very high density (we generally say <2TB
per node is best) is manageability - if you need to replace a node or add
a node, then restreaming data takes a *long* time and there is a fairly
high chance of a glitch in the universe meaning you have to start again
before it’s done.
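
A back-of-envelope sketch of how restreaming time scales with density. The
25 MB/s sustained streaming rate is an assumed figure for illustration only,
and `restream_hours` is a hypothetical helper, not part of any Cassandra
tool; real rates depend on throttling settings, network and disk.

```python
def restream_hours(data_tb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to stream data_tb terabytes at a sustained rate in MB/s."""
    total_mb = data_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return total_mb / throughput_mb_per_s / 3600


if __name__ == "__main__":
    # Assumed sustained rate; substitute what you actually observe.
    assumed_rate_mb_per_s = 25
    for density_tb in (1, 2, 5):
        hours = restream_hours(density_tb, assumed_rate_mb_per_s)
        print(f"{density_tb} TB per node: {hours:.1f} h to restream")
```

Under that assumption a 2TB node takes roughly 22 hours to restream and a
5TB node more than two days, which is a long window in which a single
failure forces a restart.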

Also, if you use STCS (SizeTieredCompactionStrategy) you can end up with
gigantic compactions, which also take a long time and can cause issues.
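
A simplified sketch of why STCS compactions get so large on dense nodes. It
assumes every compaction merges `min_threshold` equal-sized SSTables with no
shrinkage from overwrites or tombstones, and the 0.5GB flush size is made
up; `stcs_tier_sizes` is a hypothetical helper, not a Cassandra API.

```python
def stcs_tier_sizes(node_data_gb: float, flush_gb: float,
                    min_threshold: int = 4) -> list[float]:
    """SSTable size per tier (GB) under an idealised size-tiered model.

    Each tier is min_threshold times the previous one, because a compaction
    merges min_threshold similar-sized SSTables into one. Tiers stop once
    the next SSTable would be larger than the data held on the node.
    """
    sizes = [flush_gb]
    while sizes[-1] * min_threshold <= node_data_gb:
        sizes.append(sizes[-1] * min_threshold)
    return sizes


if __name__ == "__main__":
    tiers = stcs_tier_sizes(node_data_gb=2000, flush_gb=0.5)
    print("tier sizes (GB):", tiers)
    # Producing the top-tier SSTable means rewriting that much data in a
    # single compaction.
    print("largest single compaction rewrites about", tiers[-1], "GB")
```

In this model a node holding 2000GB ends up with a top-tier SSTable of
512GB, so one compaction has to read and rewrite half a terabyte in a
single pass.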

Heap limitations are mainly related to partition size rather than node
density in my experience.

Cheers
Ben

On Thu, 9 Feb 2017 at 08:20 Hannu Kröger <hkroger@gmail.com> wrote:

> Hello,
>
> Back in the day it was recommended that max disk density per node for
> Cassandra 1.2 was at around 3-5TB of uncompressed data.
>
> IIRC it was mostly because of heap memory limitations? Now that off-heap
> support is there for certain data and 3.x has different data storage
> format, is that 3-5TB still a valid limit?
>
> Does anyone have experience running Cassandra with 3-5TB of compressed
> data?
>
> Cheers,
> Hannu

-- 
————————
Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798
