cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Thorsten von Eicken <>
Subject Re: Why so many SSTables?
Date Thu, 12 Apr 2012 23:40:57 GMT
>From my experience I would strongly advise against leveled compaction
for your use-case. But you should certainly test and see for yourself!
I have ~1TB on a node with ~13GB of heap. I ended up with 30k SSTables.
I raised the SSTable size to 100MB but that didn't prove to be
sufficient and I did it too late. I also had issues along the way where
I ran out of heap space (before going to 13GB) and the flushing that
then happens caused lots of small SSTables to be produced. Loading up
30k SSTables at start-up took 30+ minutes. With 30k SSTables I had 5
levels, if I remember correctly, that means everything goes through 5
compactions in the long run and the resulting cpu load is just
"phenomenal". I was using snappy compressor, maybe that exacerbated the
problem. At some point I overloaded thing with writes and had a
compaction backlog of over 3000. I stopped all data loading and after
several days it was still compacting away and not about to end. I
eventually switched to size tiered compaction and life is back to normal...
Of course, leveled compaction may work wonderfully for others, this is
just what I experienced. If I were to try it again, I'd watch my SSTable
count and abort the experiment if I end up with >10k and I'd watch my
compaction backlog and abort if it goes above 100.
- Thorsten

On 4/12/2012 12:54 AM, Romain HARDOUIN wrote:
> I've just opened a new JIRA: CASSANDRA-4142
> I've double checked numbers, 7747 seems to be array list object's
> capacity (Eclipse Memory Analyzer displays "java.lang.Object[7747] @
> 0x7d3f3f798").
> Actually there are 5757 browsable entries in EMA therefore each object
> is about 140 KB (size varies between 143088 bits and 143768 bits).
> We have no pending compactions tasks. Our cluster is currently
> under-loaded.
> Our goal is to handle hundreds of tera bytes which explains 1 TB per
> node. Our need is to archive data so our cluster is not running under
> a read-heavy load.
> @Dave Brosius: I made a mistake. To be correct 786 MB is 47% of *leak
> suspects* as reported by Eclipse Memory Analyser.
> Our Cassandra nodes are pretty standard: 10 GB of ram with Xmx set to
> 2,5 GB.
> Regards,
> Romain 

View raw message