cassandra-user mailing list archives

From Franc Carter
Subject Re: Storage management during rapid growth
Date Thu, 31 Oct 2013 21:01:44 GMT
I can't comment on the technical question. However, one thing I learnt from
managing the growth of data is that the $/GB of storage tends to drop at a
rate that can absorb a moderate proportion of the increase in cost due to the
increase in data size. I'd recommend taking a wet-finger-in-the-air stab
at projecting the growth in data size against the historical trend in the
decrease in cost of storage.
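
As a rough illustration of that kind of projection, here is a minimal sketch.
It assumes both data size and $/GB change at a constant yearly rate, which is
a simplification. The function name and every figure in it (1000 GB, $1.00/GB,
60% yearly growth, 25% yearly price decline) are made-up placeholders, not
measurements from any real cluster or price list.

```python
def projected_storage_cost(initial_gb, price_per_gb, data_growth, price_decline, years):
    """Project storage cost per year under constant compound rates.

    data_growth and price_decline are yearly fractions (0.60 means +60% data,
    0.25 means -25% price). Returns a list of (year, size_gb, price, cost).
    """
    rows = []
    for year in range(years + 1):
        size_gb = initial_gb * (1 + data_growth) ** year
        price = price_per_gb * (1 - price_decline) ** year
        rows.append((year, size_gb, price, size_gb * price))
    return rows


if __name__ == "__main__":
    # Hypothetical inputs: substitute your own growth estimate and price history.
    for year, size_gb, price, cost in projected_storage_cost(1000, 1.00, 0.60, 0.25, 3):
        print(f"year {year}: {size_gb:8.0f} GB at ${price:.2f}/GB = ${cost:,.0f}")
```

Under these assumptions the total cost changes by a factor of
(1 + data_growth) * (1 - price_decline) each year, so the falling price offsets
part of the growth and only cancels it completely when that product is 1 or less.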


On Fri, Nov 1, 2013 at 7:15 AM, Dave Cowen wrote:

> Hi, all -
> I'm currently managing a small Cassandra cluster, several nodes with local
> SSD storage.
> It's difficult to forecast the growth of the Cassandra data over the
> next couple of years for various reasons, but it is virtually guaranteed to
> grow substantially.
> During this time, there may be times where it is desirable to increase the
> amount of storage available to each node, but, assuming we are not I/O
> bound, keep from expanding the cluster horizontally with additional nodes
> that have local storage. In addition, expanding with local SSDs is costly.
> My colleagues and I have had several discussions of a couple of other
> options that don't involve scaling horizontally or adding SSDs:
> 1) Move to larger, cheaper spinning-platter disks. However, when
> monitoring the performance of our cluster, we see sustained periods -
> especially during repair/compaction/cleanup - of several hours where there
> are >2000 IOPS. It will be hard to get to that level of performance in each
> node with spinning platter disks, and we'd prefer not to take that kind of
> performance hit during maintenance operations.
> 2) Move some nodes to a SAN solution, ensuring that there is a mix of
> storage, drives, LUNs and RAIDs so that there isn't a single point of
> failure. While we're aware that this is frowned on in the Cassandra
> community due to Cassandra's design, a SAN seems like the obvious way of
> being able to quickly add storage to a cluster without having to juggle
> local drives, and provides a level of performance between local spinning
> platter drives and local SSDs.
> So, the questions:
> 1) Has anyone moved from SSDs to spinning-platter disks, or managed a
> cluster that contained both? Do the numbers we're seeing exaggerate the
> performance hit we'd see if we moved to spinners?
> 2) Have you successfully used a SAN or a hybrid SAN solution (some local,
> some SAN-based) to dynamically add storage to the cluster? What type of SAN
> have you used, and what issues have you run into?
> 3) Am I missing a way of economically scaling storage?
> Thanks for any insight.
> Dave


*Franc Carter* | Systems architect | Sirca Ltd

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215
