cassandra-user mailing list archives

From Rustam Aliyev
Subject Re: Cassandra and disk space
Date Thu, 09 Dec 2010 18:20:10 GMT
Are there any plans to improve this in the future?

For big data clusters this could be very expensive. Based on your 
comment, I will need 200TB of storage for 100TB of data to keep 
Cassandra running.
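
To make the arithmetic explicit, here is a minimal sketch of the sizing implied by the 50% guideline. The function name and the `max_fill` parameter are my own, purely for illustration, and are not part of Cassandra:

```python
def required_capacity_tb(data_tb: float, max_fill: float = 0.5) -> float:
    """Raw disk capacity (TB) needed so that data_tb occupies at most
    max_fill of the disks. max_fill=0.5 mirrors the "stay under 50%"
    advice from the thread; it is a guideline, not a Cassandra setting."""
    if not 0 < max_fill <= 1:
        raise ValueError("max_fill must be in (0, 1]")
    return data_tb / max_fill


# 100 TB of data at a 50% fill limit
print(required_capacity_tb(100))  # 200.0
```

This counts only the headroom for temporary growth. It assumes the 100TB figure already includes replication.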


On 09/12/2010 17:56, Tyler Hobbs wrote:
> If you are on 0.6, repair is particularly dangerous with respect to 
> disk space usage.  If your replica is sufficiently out of sync, you 
> can triple your disk usage pretty easily.  This has been improved in 
> 0.7, so repairs should use about half as much disk space, on average.
> In general, yes, keep your nodes under 50% disk usage at all times.  
> Any of: compaction, cleanup, snapshotting, repair, or bootstrapping 
> (the latter two are improved in 0.7) can double your disk usage 
> temporarily.
> You should plan to add more disk space or add nodes when you get close 
> to this limit.  Once you go over 50%, it's more difficult to add 
> nodes, at least in 0.6.
> - Tyler
> On Thu, Dec 9, 2010 at 11:19 AM, Mark wrote:
>     I recently ran into a problem during a repair operation where my
>     nodes completely ran out of space and my whole cluster was...
>     well, clusterfucked.
>     I want to make sure I know how to prevent this problem in the future.
>     Should I make sure that at all times every node is under 50% of
>     its disk space? Are there any normal day-to-day operations that
>     would cause any one node to double in size that I should be
>     aware of? If one or more nodes surpass the 50% mark, what should
>     I plan to do?
>     Thanks for any advice
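
For watching the 50% mark discussed in the quoted thread, a rough sketch of a per-node check could look like the following. It uses only the Python standard library. The helper names and the default limit are hypothetical, and the path you pass should be whatever volume holds your Cassandra data directory:

```python
import shutil


def disk_fill_fraction(path: str) -> float:
    """Fraction of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def over_limit(path: str, limit: float = 0.5) -> bool:
    """True if the volume is fuller than `limit` (0.5 = the 50% guideline)."""
    return disk_fill_fraction(path) > limit


if __name__ == "__main__":
    # "." is a placeholder; point this at the data volume on each node.
    print(f"fill: {disk_fill_fraction('.'):.1%}, over 50%: {over_limit('.')}")
```

Something like this could run from cron on each node, so you get a warning before a repair or compaction needs the extra space.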
