cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: Load Vs Disk Space Usage
Date Thu, 18 Nov 2010 21:04:33 GMT
> We're playing around with Cassandra trying to get a feel for it. Can
> someone please explain the difference between load (from nodetool) and
> what's actually stored on disk? Sometimes these numbers mirror each
> other and sometimes the disk usage is up to 2x the load reported, as
> you can see below...
[snip]
> Runs 3, 5, 6, 9, and 12 don't seem to match up well. Can someone
> explain this please?

Probably there are obsolete sstables (ones already replaced by
compaction) that have not yet been removed. Removal of sstables is
somewhat delayed because it relies on the JVM's GC to avoid
synchronization complexities in the implementation. See:

   http://wiki.apache.org/cassandra/MemtableSSTable

I believe obsolete sstables do not count towards load, which would
explain disk usage exceeding the load that nodetool reports.

You can either trigger a GC, restart the Cassandra nodes, or just
wait until the files disappear (generating enough activity to trigger
a CMS sweep of the heap should be enough, assuming you use CMS).
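
To check whether the gap closes after a GC or restart, it can help to
measure the on-disk side yourself and compare it with the load from
nodetool. The following is only a minimal sketch in Python: the
function name disk_usage_bytes and the default path
/var/lib/cassandra/data are assumptions for illustration, so point it
at whatever data directory your nodes are actually configured to use.

```python
import os
import sys


def disk_usage_bytes(root):
    """Return the total size in bytes of all files found under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are not followed
                total += os.lstat(path).st_size
            except OSError:
                # The file vanished between listing and stat,
                # e.g. an obsolete sstable was just deleted.
                pass
    return total


if __name__ == "__main__":
    # Hypothetical default; pass your own data directory as argument 1.
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/cassandra/data"
    print("%d bytes on disk under %s" % (disk_usage_bytes(root), root))
```

Running it before and after the obsolete sstables are cleaned up
should show the on-disk total moving back towards the reported load.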

--
/ Peter Schuller
