cassandra-user mailing list archives

From Peter Schuller <>
Subject Re: nodetool repair caused high disk space usage
Date Fri, 19 Aug 2011 18:52:23 GMT
> There were few Compacted files.  I thought that might have been the cause,
> but it wasn't it.  We have a CF that is 23GB, and while repair is running,
> there are multiple instances of that CF created along with other CFs.

To confirm - are you saying the data directory size is huge, but the
live size as reported by nodetool ring and nodetool info does NOT
reflect this inflated size?

What files *do* you have in the data directory? Any left-over *tmp*
files for example?

Are you sure you're only running a single repair at a time? (Sorry if
this was covered, I did a quick swipe through thread history because I
was unsure whether I was confusing two different threads, and I don't
think so.)

The question is what's taking the space. If it's sstables, they really
should be either compacted ones that are marked for deletion but
still being retained, or "live" sstables, in which case they should
show up as load in nodetool.

What else... maybe streams are being re-tried from the source nodes
and the disk space is coming from a bunch of half-finished streams of
the same data. But if so, those should be *tmp* files IIRC.
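
One way to see where the space is going is to total the data directory
by category. The following is only a rough sketch, not a Cassandra
tool. It assumes that temporary files carry "-tmp-" in their names, and
that a compacted sstable awaiting deletion has a sibling marker file
ending in "-Compacted" that shares its name prefix. The function names
are made up for illustration, and the naming assumptions may not hold
for every version.

```python
import os

def classify(name, siblings):
    """Return 'tmp', 'compacted' or 'live' for a single file name.

    Assumptions (not verified against any particular Cassandra version):
      - temporary files contain '-tmp-' in their name
      - a compacted sstable has a marker file '<prefix>-Compacted' next
        to its '<prefix>-Data.db', '<prefix>-Index.db', etc.
    """
    if "-tmp-" in name:
        return "tmp"
    # Strip the last component ('Data.db', 'Index.db', 'Compacted', ...).
    prefix = name.rsplit("-", 1)[0]
    if prefix + "-Compacted" in siblings:
        return "compacted"
    return "live"

def summarize(data_dir):
    """Walk data_dir and return total bytes per category."""
    totals = {"tmp": 0, "compacted": 0, "live": 0}
    for root, _dirs, files in os.walk(data_dir):
        siblings = set(files)
        for name in files:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file may have been deleted while we were walking.
                continue
            totals[classify(name, siblings)] += size
    return totals
```

Calling summarize() on the node's data directory and comparing the
"live" total against the load reported by nodetool would show whether
the extra space sits in tmp files, in compacted-but-retained sstables,
or in files that nodetool does count.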

I'm just wildly speculating, but it would be nice to get to the bottom of this.

/ Peter Schuller (@scode on twitter)
