cassandra-user mailing list archives

From SriSatish Ambati <srisatish.amb...@gmail.com>
Subject Re: Cassandra GC Settings
Date Mon, 17 Jan 2011 21:19:07 GMT
Thanks, Dan:

Yes, -Xmn512M/1G sizes the young generation explicitly and takes adaptive
resizing out of the picture. (If at all possible, send your GC log over and we
can analyze the promotion failure a little more finely.) The low load implies
that you are able to use the parallel GC threads effectively.
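
To make the sizing concrete, here is a rough back-of-envelope sketch in
Python. It assumes the HotSpot default -XX:TargetSurvivorRatio=50, which is
what produces the ~38 MB "Desired survivor size" in the promotion-failure line
quoted further down in this thread:

```python
MB = 1024 * 1024

# Flags from this thread: -Xmn512M -XX:SurvivorRatio=5
xmn = 512 * MB          # total young generation
survivor_ratio = 5      # eden is 5x the size of one survivor space
target_survivor = 0.50  # HotSpot default -XX:TargetSurvivorRatio=50 (assumed)

# Young gen = eden + 2 survivor spaces, i.e. (ratio + 2) survivor-sized parts
survivor = xmn // (survivor_ratio + 2)
eden = xmn - 2 * survivor
desired = int(survivor * target_survivor)

print(f"eden     ~ {eden / MB:.1f} MB")      # ~365.7 MB
print(f"survivor ~ {survivor / MB:.1f} MB")  # ~73.1 MB
print(f"desired survivor occupancy ~ {desired / MB:.1f} MB")  # ~36.6 MB
```

The log's "Desired survivor size 38338560 bytes" matches this ~36.6 MB figure
to within alignment, and a single ~55 MB age-1 cohort simply cannot fit, which
is why the tenuring threshold collapsed to 1 and the promotion failed.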

cheers,
Sri

On Mon, Jan 17, 2011 at 9:05 PM, Dan Hendry <dan.hendry.junk@gmail.com> wrote:

> Thanks for all the info, I think I have been able to sort out my issue. The
> new settings I am using are:
>
> -Xmn512M (Very important I think)
> -XX:SurvivorRatio=5 (Not very important I think)
> -XX:MaxTenuringThreshold=5
> -XX:ParallelGCThreads=8
> -XX:CMSInitiatingOccupancyFraction=75
>
> Since applying these settings, the one time I saw the same type of behavior
> as before, the following appeared in the GC log.
>
>   Total time for which application threads were stopped: 0.6830080 seconds
>   1368.201: [GC 1368.201: [ParNew (promotion failed)
>   Desired survivor size 38338560 bytes, new threshold 1 (max 5)
>   - age   1:   55799736 bytes,   55799736 total
>   : 449408K->449408K(449408K), 0.2618690 secs]1368.463: [CMS1372.459:
> [CMS-concurrent-mark: 7.930/9.109 secs] [Times: user=28.31 sys=0.66, real=9.11 secs]
>    (concurrent mode failure): 9418431K->6267709K(11841536K), 26.4973750 secs]
> 9777393K->6267709K(12290944K), [CMS Perm : 20477K->20443K(34188K)],
> 26.7595510 secs] [Times: user=31.75 sys=0.00, real=26.76 secs]
>   Total time for which application threads were stopped: 26.7617560 seconds
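
That 26.7 s stop is the killer. If it helps, here is a quick sketch for
pulling the safepoint pauses out of a GC log; it assumes the standard
-XX:+PrintGCApplicationStoppedTime line format shown above, and the function
name is just mine:

```python
import re

PAUSE_RE = re.compile(
    r"Total time for which application threads were stopped: ([\d.]+) seconds")

def stop_the_world_pauses(log_lines):
    """Return the safepoint pause durations (seconds) found in a GC log."""
    return [float(m.group(1)) for line in log_lines
            if (m := PAUSE_RE.search(line))]

# The two stopped-time lines from the excerpt above
log = [
    "Total time for which application threads were stopped: 0.6830080 seconds",
    "1368.201: [GC 1368.201: [ParNew (promotion failed)",
    "Total time for which application threads were stopped: 26.7617560 seconds",
]
pauses = stop_the_world_pauses(log)
print(max(pauses))  # the 26.76 s concurrent-mode-failure pause
```

Summing (or plotting) those values over a whole compaction run makes it easy
to see how much wall-clock time the node actually spent frozen.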
>
> Now, a full stop of the application is exactly what I was seeing extensively
> before (100-200 times over the course of a major compaction, as reported by
> gossipers on other nodes). I have also just noticed that the previous
> instability (i.e. application stops) correlated with the compaction of a few
> column families characterized by fairly fat rows (10 MB mean size, max sizes
> of 150-200 MB, up to a million+ columns per row). My theory is that each row
> being compacted under the old settings was being promoted to the old
> generation, thereby running the heap out of space and causing a
> stop-the-world GC. With the new settings, rows being compacted typically
> remain in the young generation, allowing them to be cleaned up more quickly
> and with less effort on the part of the garbage collector. Does this theory
> sound reasonable?
>
> Answering some of the other questions:
>
> > disk bound or CPU bound during compaction?
>
> ... Neither (?). Iowait is 10-20%, disk utilization rarely jumps above 60%,
> and CPU %idle is about 60%. I would have said I was memory bound, but I now
> think compaction is bounded by being single threaded.
>
> > are you sure you're not swapping a bit?
>
> Only if JNA is not doing its job
>
> > Number of cores on your system. How busy is the system?
>
> 8, load factors typically < 4 so not terribly busy I would say.
>
> On Mon, Jan 17, 2011 at 12:58 PM, Peter Schuller <
> peter.schuller@infidyne.com> wrote:
>
>> > very quickly from the young generation to the old generation".
>> > Furthermore, the CMSInitiatingOccupancyFraction of 75 (from a JVM default
>> > of 68) means "start gc in the old generation later", presumably to allow
>> > Cassandra to use more of the old generation heap without needlessly trying
>> > to free up used space (?). Please correct me if I am misinterpreting these
>> > settings.
>>
>> Note the use of -XX:+UseCMSInitiatingOccupancyOnly, which causes the
>> JVM to always trigger on that occupancy fraction rather than only doing so
>> for the first trigger (or something along those lines) and then
>> switching to heuristics. Presumably (though I don't specifically know the
>> history of this particular option being added) it is more important to
>> avoid doing full GCs at all than to super-optimally tweak the trigger
>> for maximum throughput.
>>
>> The heuristics tend to cut it pretty close, and setting a conservative
>> fixed occupancy trigger probably greatly lessens the chance of falling
>> back to a full gc in production.
>>
>> > One of the issues I have been having is extreme node instability when
>> > running a major compaction. After 20-30 seconds of operation, the node
>> > spends 30+ seconds in (what I believe to be) GC. Now I have tried halving
>> > all memtable thresholds to reduce overall heap memory usage but that has
>> > not seemed to help with the instability. After one of these blips, I often
>> > see log entries as follows:
>> >  INFO [ScheduledTasks:1] 2011-01-17 10:41:21,961 GCInspector.java (line 133)
>> > GC for ParNew: 215 ms, 45084168 reclaimed leaving 11068700368 used; max is
>> > 12783583232
>> >  INFO [ScheduledTasks:1] 2011-01-17 10:41:28,033 GCInspector.java (line 133)
>> > GC for ParNew: 234 ms, 40401120 reclaimed leaving 12144504848 used; max is
>> > 12783583232
>> >  INFO [ScheduledTasks:1] 2011-01-17 10:42:15,911 GCInspector.java (line 133)
>> > GC for ConcurrentMarkSweep: 45828 ms, 3350764696 reclaimed leaving
>> > 9224048472 used; max is 12783583232
>>
>> 45 seconds is pretty significant even for a 12 gig heap unless you're
>> really CPU loaded so that there is heavy contention over the CPU.
>> While I don't see anything obviously extreme; are you sure you're not
>> swapping a bit?
>>
>> Also, what do you mean by node instability - does it *completely* stop
>> responding during these periods or does it flap in and out of the
>> cluster but is still responding?
>>
>> Are your nodes disk bound or CPU bound during compaction?
>>
>> --
>> / Peter Schuller
>>
>
>
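
P.S. To put Peter's point about the occupancy trigger into numbers, here is
some rough arithmetic using the old-generation capacity from the CMS log line
earlier in this thread:

```python
KB = 1024

# Values taken from the CMS log line quoted above
old_gen_capacity = 11841536 * KB   # 11841536K old generation capacity
used_at_failure = 9418431 * KB     # occupancy when concurrent mode failure hit

# -XX:CMSInitiatingOccupancyFraction=75: CMS should start around 75% occupancy
trigger = old_gen_capacity * 75 // 100

print(f"CMS trigger     ~ {trigger / 2**30:.2f} GiB")          # ~8.47 GiB
print(f"usage at failure ~ {used_at_failure / 2**30:.2f} GiB")  # ~8.98 GiB
# CMS did start before the failure point, but the concurrent cycle could not
# finish before the old generation filled -- hence the concurrent mode failure.
```

So the trigger itself was not the problem; the allocation (promotion) rate
during compaction simply outran the concurrent collector.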
