cassandra-user mailing list archives

From Huy Le <>
Subject Re: nodetool repair caused high disk space usage
Date Fri, 19 Aug 2011 18:42:00 GMT
There were a few *-Compacted files.  I thought that might have been the
cause, but it wasn't.  We have a CF that is 23 GB, and while repair is
running, multiple instances of that CF are created along with other CFs.

I checked the stream directory across the cluster of four nodes, but it was

I cannot reproduce this issue on version 0.6.11 with a copy of the data
backed up prior to the 0.8.4 upgrade.

Repair is currently still running on two 0.6.11 nodes.  My plan is to run
compact across the cluster while it is on 0.6.11 and, when that is done,
make another attempt to upgrade to 0.8.4.


On Fri, Aug 19, 2011 at 2:26 PM, Peter Schuller <> wrote:

> > After upgrading to Cassandra 0.8.4 from 0.6.11, I ran scrub.  That
> > worked fine.  Then I ran nodetool repair on one of the nodes.  The disk
> > usage on the data directory increased from 40 GB to 480 GB, and it's
> > still growing.
> If you check your data directory, does it contain a lot of
> "*Compacted" files? It sounds like you're churning sstables from a
> combination of compactions/flushes (including triggered by repair) and
> the old ones aren't being deleted. I wonder if there is still some
> issue causing sstable retention.
> Since you're on 0.8.4, I'm a bit suspicious. I'd have to re-check each
> JIRA, but I think the major known repair problems should be fixed,
> except for CASSANDRA-2280, which is not your problem since you're going
> from a total load of 40 GB to hundreds of GB (so even with all
> CFs streaming, that's unexpected).
> Do you have any old left-over streams active on the nodes? "nodetool
> netstats". If there are "stuck" streams, they might be causing sstable
> retention beyond what you'd expect.
> --
> / Peter Schuller (@scode on twitter)
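To sweep all four nodes for the left-over streams Peter mentions, one could script around `nodetool -h <host> netstats` output.  A rough sketch of the check; the idle-node phrasing ("Not sending any streams." / "Not receiving any streams.") is an assumption about 0.8-era output and should be verified against what your version actually prints:

```python
def has_active_streams(netstats_output):
    """Heuristically flag active (possibly stuck) streams in netstats text.

    Assumes an idle node prints 'Not sending any streams.' and
    'Not receiving any streams.' (assumed 0.8-era wording; verify locally).
    """
    out = netstats_output.lower()
    idle = ('not sending any streams' in out and
            'not receiving any streams' in out)
    return not idle
```

Feeding each node's netstats output through this on a loop would show whether any node is holding streams open long after repair should have finished.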

Huy Le
Spring Partners, Inc.
