incubator-cassandra-user mailing list archives

From Peter Schuller <peter.schul...@infidyne.com>
Subject Re: Cassandra disk space utilization WAY higher than I would expect
Date Wed, 18 Aug 2010 17:57:20 GMT
> I actually have the log files from all 8 nodes if it helps to diagnose what
> activity was going on behind the scenes.  I really need to understand how this
> happened.

Without necessarily dumping all the information: approximately what
do they contain? Is there anything about compactions,
anti-compactions, streaming, etc.?

On a node that is idle after taking writes, I *think* the only
expected disk I/O (once it has settled) would be a memtable flush
triggered by memtable_flush_after_mins, and possibly compactions
resulting from that (depending on how close the node was to
triggering a compaction before the flush). Whatever is causing
additional sstables to be written, even if it was somehow triggered
incorrectly, I'd hope it was still logged.

What about something like a gossip issue, with nodes disagreeing
about the token space? But even then, why would nodes spontaneously
start pushing data? My understanding is that streaming is currently
only triggered by administrative operations, which seems to be
confirmed by:

   http://wiki.apache.org/cassandra/Streaming

Assuming the log files do contain activity such as compaction or
streaming: do those events correlate well in time across nodes,
and/or with something else?
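
One rough way to check the timing question is to bucket the
interesting log lines per minute and see which nodes were active in
the same minute. The following is only a sketch under assumptions:
it presumes log4j-style timestamps such as "2010-08-18 17:57:20,123"
in each line, and it matches on the substrings "compact", "stream"
and "flush", which may not correspond exactly to what your logs say.
The names activity_by_minute, KEYWORDS and TIMESTAMP_RE are made up
for this illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: each log line carries a log4j-style timestamp,
# e.g. "2010-08-18 17:57:20,123". Adjust if your layout differs.
TIMESTAMP_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3}")

# Assumption: these substrings are enough to spot the relevant activity.
# "compact" also matches "anti-compaction".
KEYWORDS = ("compact", "stream", "flush")


def activity_by_minute(logs_by_node, keywords=KEYWORDS):
    """Return {minute: {node: matching_line_count}}, sorted by minute.

    logs_by_node maps a node name to an iterable of log lines.
    """
    buckets = defaultdict(lambda: defaultdict(int))
    for node, lines in logs_by_node.items():
        for line in lines:
            lowered = line.lower()
            if not any(keyword in lowered for keyword in keywords):
                continue  # not compaction/streaming/flush related
            match = TIMESTAMP_RE.search(line)
            if match is None:
                continue  # no parseable timestamp, skip the line
            minute = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M")
            buckets[minute][node] += 1
    return {minute: dict(nodes) for minute, nodes in sorted(buckets.items())}
```

Feed it one list of lines per node (for example by reading each
node's log file) and look for minutes in which several nodes show up
together. This assumes the clocks on the nodes are reasonably in
sync.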

-- 
/ Peter Schuller
