incubator-cassandra-user mailing list archives

From Scott Dworkis <...@mylife.com>
Subject Re: Cassandra and disk space
Date Thu, 09 Dec 2010 20:13:28 GMT
i recently finished a practice expansion of 4 nodes to 5 nodes, a series 
of "nodetool move", "nodetool cleanup" and jmx gc steps.  i found that in 
some of the steps, disk usage actually grew to 2.5x the base data size on 
one of the nodes.  i'm using 0.6.4.

-scott

On Thu, 9 Dec 2010, Rustam Aliyev wrote:

> Are there any plans to improve this in the future?
> 
> For big data clusters this could be very expensive. Based on your comment, I will need
> 200TB of storage for 100TB of data to keep Cassandra running.
> 
> --
> Rustam.
> 
> On 09/12/2010 17:56, Tyler Hobbs wrote:
>       If you are on 0.6, repair is particularly dangerous with respect to disk space
>       usage.  If your replica is sufficiently out of sync, you can triple your disk
>       usage pretty easily.  This has been improved in 0.7, so repairs should use
>       about half as much disk space, on average.
>
>       In general, yes, keep your nodes under 50% disk usage at all times.  Any of:
>       compaction, cleanup, snapshotting, repair, or bootstrapping (the latter two
>       are improved in 0.7) can double your disk usage temporarily.
>
>       You should plan to add more disk space or add nodes when you get close to this
>       limit.  Once you go over 50%, it's more difficult to add nodes, at least in 0.6.
>
>       - Tyler
>
>       On Thu, Dec 9, 2010 at 11:19 AM, Mark <static.void.dev@gmail.com> wrote:
>             I recently ran into a problem during a repair operation where my nodes
>             completely ran out of space and my whole cluster was... well, clusterfucked.
>
>             I want to make sure I know how to prevent this problem in the future.
>
>             Should I make sure that at all times every node is under 50% of its disk
>             space? Are there any normal day-to-day operations that would cause any one
>             node to double in size that I should be aware of? If one or more nodes
>             surpass the 50% mark, what should I plan to do?
>
>             Thanks for any advice