cassandra-user mailing list archives

From Philo Yang <ud1...@gmail.com>
Subject full gc too often
Date Fri, 05 Dec 2014 04:13:29 GMT
Hi all,

I have a cluster running C* 2.1.1 on JDK 7u51. I am having trouble with
full GC: sometimes one or two nodes run a full GC more than once per minute,
each taking over 10 seconds. The node then becomes unreachable and the
latency of the cluster increases.

I grepped the GCInspector log and found that when a node is running fine,
without GC trouble, there are two kinds of GC:
ParNew GC in less than 300ms, which clears the Par Eden Space and slightly
grows CMS Old Gen / Par Survivor Space (GCInspector only logs GCs that take
more than 200ms, so only a small number of ParNew GCs appear in the log).
ConcurrentMarkSweep in 4000~8000ms, which shrinks CMS Old Gen considerably
and slightly grows Par Eden Space; it runs once every 1-2 hours.

However, sometimes ConcurrentMarkSweep behaves strangely, as shown here:

INFO  [Service Thread] 2014-12-05 11:28:44,629 GCInspector.java:142 -
ConcurrentMarkSweep GC in 12648ms.  CMS Old Gen: 3579838424 -> 3579838464;
Par Eden Space: 503316480 -> 294794576; Par Survivor Space: 62914528 -> 0
INFO  [Service Thread] 2014-12-05 11:28:59,581 GCInspector.java:142 -
ConcurrentMarkSweep GC in 12227ms.  CMS Old Gen: 3579838464 -> 3579836512;
Par Eden Space: 503316480 -> 310562032; Par Survivor Space: 62872496 -> 0
INFO  [Service Thread] 2014-12-05 11:29:14,686 GCInspector.java:142 -
ConcurrentMarkSweep GC in 11538ms.  CMS Old Gen: 3579836688 -> 3579805792;
Par Eden Space: 503316480 -> 332391096; Par Survivor Space: 62914544 -> 0
INFO  [Service Thread] 2014-12-05 11:29:29,371 GCInspector.java:142 -
ConcurrentMarkSweep GC in 12180ms.  CMS Old Gen: 3579835784 -> 3579829760;
Par Eden Space: 503316480 -> 351991456; Par Survivor Space: 62914552 -> 0
INFO  [Service Thread] 2014-12-05 11:29:45,028 GCInspector.java:142 -
ConcurrentMarkSweep GC in 10574ms.  CMS Old Gen: 3579838112 -> 3579799752;
Par Eden Space: 503316480 -> 366222584; Par Survivor Space: 62914560 -> 0
INFO  [Service Thread] 2014-12-05 11:29:59,546 GCInspector.java:142 -
ConcurrentMarkSweep GC in 11594ms.  CMS Old Gen: 3579831424 -> 3579817392;
Par Eden Space: 503316480 -> 388702928; Par Survivor Space: 62914552 -> 0
INFO  [Service Thread] 2014-12-05 11:30:14,153 GCInspector.java:142 -
ConcurrentMarkSweep GC in 11463ms.  CMS Old Gen: 3579817392 -> 3579838424;
Par Eden Space: 503316480 -> 408992784; Par Survivor Space: 62896720 -> 0
INFO  [Service Thread] 2014-12-05 11:30:25,009 GCInspector.java:142 -
ConcurrentMarkSweep GC in 9576ms.  CMS Old Gen: 3579838424 -> 3579816424;
Par Eden Space: 503316480 -> 438633608; Par Survivor Space: 62914544 -> 0
INFO  [Service Thread] 2014-12-05 11:30:39,929 GCInspector.java:142 -
ConcurrentMarkSweep GC in 11556ms.  CMS Old Gen: 3579816424 -> 3579785496;
Par Eden Space: 503316480 -> 441354856; Par Survivor Space: 62889528 -> 0
INFO  [Service Thread] 2014-12-05 11:30:54,085 GCInspector.java:142 -
ConcurrentMarkSweep GC in 12082ms.  CMS Old Gen: 3579786592 -> 3579814464;
Par Eden Space: 503316480 -> 448782440; Par Survivor Space: 62914560 -> 0

Each time, Old Gen shrinks only slightly and Survivor Space is cleared, but
the heap is still full, so another full GC follows very soon and the node
eventually goes down. If I restart the node, it runs fine without GC trouble.
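
To put numbers on "only slightly", here is a rough Python sketch that parses
GCInspector lines like the ones quoted above and reports how much of CMS Old
Gen each cycle gave back. The regular expression is written against the line
format shown in this message only; the function names and the 1% threshold
are my own assumptions for illustration, not anything from Cassandra.

```python
import re

# Matches the GCInspector format quoted in this message (an assumption:
# other Cassandra versions may log a different layout).
GC_LINE = re.compile(
    r"(?P<collector>ParNew|ConcurrentMarkSweep) GC in (?P<ms>\d+)ms\.\s+"
    r"CMS Old Gen: (?P<before>\d+) -> (?P<after>\d+)"
)


def parse_gc_events(text):
    """Return one dict per GC event found in the log text."""
    # The quoted log lines are wrapped, so collapse all whitespace first.
    flat = " ".join(text.split())
    events = []
    for m in GC_LINE.finditer(flat):
        before = int(m.group("before"))
        after = int(m.group("after"))
        events.append({
            "collector": m.group("collector"),
            "duration_ms": int(m.group("ms")),
            "old_before": before,
            "old_after": after,
            # Negative means Old Gen grew during the cycle.
            "reclaimed": before - after,
        })
    return events


def ineffective_cms(events, min_fraction=0.01):
    """CMS cycles that freed less than min_fraction of Old Gen.

    The 1% default is an arbitrary threshold chosen for this sketch.
    """
    return [
        e for e in events
        if e["collector"] == "ConcurrentMarkSweep"
        and e["old_before"] > 0
        and e["reclaimed"] < min_fraction * e["old_before"]
    ]


# First two entries from the log above.
SAMPLE = """
INFO  [Service Thread] 2014-12-05 11:28:44,629 GCInspector.java:142 -
ConcurrentMarkSweep GC in 12648ms.  CMS Old Gen: 3579838424 -> 3579838464;
Par Eden Space: 503316480 -> 294794576; Par Survivor Space: 62914528 -> 0
INFO  [Service Thread] 2014-12-05 11:28:59,581 GCInspector.java:142 -
ConcurrentMarkSweep GC in 12227ms.  CMS Old Gen: 3579838464 -> 3579836512;
Par Eden Space: 503316480 -> 310562032; Par Survivor Space: 62872496 -> 0
"""

events = parse_gc_events(SAMPLE)
for e in ineffective_cms(events):
    print(e["duration_ms"], "ms, reclaimed", e["reclaimed"], "bytes")
```

On the two entries used here, both cycles fall under the threshold: one
frees about 2 KB out of roughly 3.5 GB and the other ends with Old Gen
slightly larger than it started.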

Can anyone help me find out why full GC cannot reduce CMS Old Gen? Is it
because there are too many objects in the heap that cannot be collected? I
think reviewing the table schema design and adding new nodes to the cluster
is a good idea, but I would still like to know whether anything else could
be causing this trouble.

Thanks,
Philo Yang
