cassandra-user mailing list archives

From Shotaro Kamio <kamios...@gmail.com>
Subject Re: Cassandra memtable and GC
Date Mon, 22 Nov 2010 13:28:46 GMT
Hi Peter,

I've tested again, this time recording LiveSSTableCount and MemtableDataSize
via JMX. I think the result supports my suspicion about memtable
performance, because I could not find any Full GC this time.
This test uses a smaller data set (160 million records in
Cassandra) and a different disk configuration from my previous post, but
the general picture doesn't change.

The attached files:
- graph-read-throughput-diskT.png:  read throughput on my client program.
- graph-diskT-stat-with-jmx.png: graph of cpu load, LiveSSTableCount
and logarithm of MemtableDataSize.
- log-gc.20101122-12:41.160M.log.gz: GC log with -XX:+PrintGC
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
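
For anyone who wants to check the GC log themselves, here is a minimal
sketch of one way to count Full GC entries in a gzipped log. It assumes
that full collections show up as lines containing the text "Full GC",
which is how HotSpot usually prints them with the flags above. The class
name FullGcScan and its methods are only illustrative, and the log path
is passed as the first argument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

public class FullGcScan {
    // Assumption: a full collection is reported on a line containing "Full GC".
    static boolean isFullGc(String line) {
        return line.contains("Full GC");
    }

    // Counts the lines that look like full collections.
    static long countFullGc(Stream<String> lines) {
        return lines.filter(FullGcScan::isFullGc).count();
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a gzipped GC log.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(Paths.get(args[0]))),
                StandardCharsets.UTF_8))) {
            System.out.println("Full GC entries: " + countFullGc(reader.lines()));
        }
    }
}
```

A count of zero only shows that no line matched this pattern; long
young-generation or concurrent-collector pauses would need a separate check.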

As you can see from the second graph, the logarithm of MemtableDataSize
and the cpu load are clearly correlated. When a memtable is flushed and a
new SSTable is created (LiveSSTableCount is incremented), read
performance recovers, but it soon degrades again.
I couldn't find any Full GC in the GC log for this test, so I guess that this
performance degradation is not a result of GC activity.
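
The correlation visible in the graph could also be put into a number. A
minimal sketch, assuming the MemtableDataSize and cpu load samples were
taken at the same instants and are held in two arrays of equal length.
The class and method names are hypothetical, and the values in main are
synthetic placeholders, not measurements from this test.

```java
public class MemtableCpuCorrelation {
    // Base-10 logarithm of each sample (samples must be positive).
    static double[] log10(double[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = Math.log10(values[i]);
        }
        return out;
    }

    // Pearson correlation coefficient of two equally long series.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        // Result is NaN if either series is constant.
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Synthetic placeholder values, NOT data from the test described here.
        double[] memtableBytes = {1e6, 1e7, 1e8};
        double[] cpuLoad = {10, 20, 30};
        System.out.println(pearson(log10(memtableBytes), cpuLoad));
    }
}
```

A coefficient close to 1 would agree with what the graph suggests, but it
says nothing about whether the memtable itself is the cause.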


Regards,
Shotaro


On Sat, Nov 20, 2010 at 6:37 PM, Peter Schuller
<peter.schuller@infidyne.com> wrote:
>> After a memtable flush, you see minimum cpu and maximum read
>> throughput, both in terms of disk and cassandra records read.
>> As the memtable increases in size, cpu goes up and read throughput drops.
>> Whether this is caused by a memtable or a GC performance issue is the
>> big question.
>>
>> As each memtable is just 128MB when flushed, I don't really expect a GC
>> problem or caching issues.
>
> A memtable is basically just a ConcurrentSkipListMap. Unless you are
> somehow triggering some kind of degenerate case in the CSLM itself,
> which seems unlikely, the only common circumstance where filling the
> memtable should result in a very significant performance drop
> is if you're running really close to the heap size and causing
> additional GC asymptotically as the memtable grows.
>
> But that doesn't seem to be the case. I don't know, maybe I missed
> something in your original post, but I'm not sure what to suggest that
> I haven't already without further information/hands-on
> experimentation/observation.
>
> But running with verbose GC as I mentioned should at least be a good
> start (-Xloggc:path/to/gclog
> -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps).
>
> --
> / Peter Schuller
>



-- 
Shotaro Kamio
