cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: GC freeze just after repair session
Date Fri, 06 Jul 2012 18:08:11 GMT
> I was thinking of decreasing concurrent_compactors and in_memory_compaction_limit to go easy on GC
I've used that technique to reduce GC pressure during compactions before.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
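
For anyone following the sizing discussion in this thread, here is a rough back-of-the-envelope sketch in Python (not Cassandra code). It reproduces the eden figure quoted below from -Xmn800M and SurvivorRatio=8, and compares it with a crude worst case that assumes every compactor holds one row of the full in_memory_compaction_limit at the same moment. That worst case is an assumption for illustration only; real allocation during compaction depends on actual row sizes and throttling. The function names are made up for this example.

```python
def young_gen_layout(young_mb, survivor_ratio):
    """Split the young generation into eden and two survivor spaces.

    HotSpot's SurvivorRatio is eden size / size of ONE survivor space,
    so the young generation is divided into (survivor_ratio + 2) parts.
    """
    survivor_mb = young_mb / (survivor_ratio + 2)
    eden_mb = survivor_mb * survivor_ratio
    return eden_mb, survivor_mb


def worst_case_in_memory_mb(concurrent_compactors, in_memory_limit_mb):
    """Crude upper bound (assumption): each compactor holds one row of
    the maximum in-memory size at the same time."""
    return concurrent_compactors * in_memory_limit_mb


# Values quoted in this thread: -Xmn800M, SurvivorRatio=8,
# in_memory_compaction_limit=64 (MB), concurrent_compactors=8.
eden_mb, survivor_mb = young_gen_layout(young_mb=800, survivor_ratio=8)
print(f"eden = {eden_mb:.0f} MB, each survivor space = {survivor_mb:.0f} MB")

for compactors in (8, 4, 2):
    worst = worst_case_in_memory_mb(compactors, in_memory_limit_mb=64)
    share = worst / eden_mb
    print(f"{compactors} compactors: up to {worst} MB in flight "
          f"({share:.0%} of eden)")
```

With the settings from the thread, this bound comes to 512 MB against a 640 MB eden, and a single 64 MB row that is still live when a ParNew runs would take most of an 80 MB survivor space. That may help explain why lowering either setting tends to ease young-generation pressure.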

On 6/07/2012, at 6:46 PM, Ravikumar Govindarajan wrote:

> Our Young size=800MB, SurvivorRatio=8, edenSize=640MB. All objects/bytes generated during compaction are garbage, right?
> 
> During compaction, with in_memory_compaction_limit=64MB and concurrent_compactors=8, there is a lot of pressure on ParNew sweeps.
> 
> I was thinking of decreasing concurrent_compactors and in_memory_compaction_limit to go easy on GC.
> 
> I am not familiar with the inner workings of Cassandra, but I hope I have diagnosed the problem to some extent.
> 
> On Fri, Jul 6, 2012 at 11:27 AM, rohit bhatia <rohit2412@gmail.com> wrote:
> @ravi, you can increase the young gen size, keep a high tenuring rate, or
> increase the survivor ratio.
> 
> 
> On Fri, Jul 6, 2012 at 4:03 AM, aaron morton <aaron@thelastpickle.com> wrote:
> > Ideally we would like to collect maximum garbage from ParNew itself, during
> > compactions. What are the steps to take towards achieving this?
> >
> > I'm not sure what you are asking.
> >
> > Cheers
> >
> > -----------------
> > Aaron Morton
> > Freelance Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 5/07/2012, at 6:56 PM, Ravikumar Govindarajan wrote:
> >
> > We have modified maxTenuringThreshold from 1 to 5. Maybe that is causing
> > problems. We will change it back to 1 and see how the system behaves.
> >
> > concurrent_compactors=8. We will reduce this, as our system won't be
> > able to handle this number of compactions at the same time anyway. I think
> > it will also ease GC to some extent.
> >
> > Ideally we would like to collect maximum garbage from ParNew itself, during
> > compactions. What are the steps to take towards achieving this?
> >
> > On Wed, Jul 4, 2012 at 4:07 PM, aaron morton <aaron@thelastpickle.com>
> > wrote:
> >>
> >> It *may* have been compaction from the repair, but it's not a big CF.
> >>
> >> I would look at the logs to see how much data was transferred to the node.
> >> Was there a compaction going on while the GC storm was happening? Do you
> >> have a lot of secondary indexes?
> >>
> >> If you think it correlates with compaction, you can try reducing
> >> concurrent_compactors.
> >>
> >> Cheers
> >>
> >> -----------------
> >> Aaron Morton
> >> Freelance Developer
> >> @aaronmorton
> >> http://www.thelastpickle.com
> >>
> >> On 3/07/2012, at 6:33 PM, Ravikumar Govindarajan wrote:
> >>
> >> Recently, we faced a severe freeze [around 30-40 mins] on one of our
> >> servers. There were many mutations/reads dropped. The issue happened just
> >> after a routine nodetool repair for the below CF completed [1.0.7, NTS,
> >> DC1:3,DC2:2]
> >>
> >> Column Family: MsgIrtConv
> >> SSTable count: 12
> >> Space used (live): 17426379140
> >> Space used (total): 17426379140
> >> Number of Keys (estimate): 122624
> >> Memtable Columns Count: 31180
> >> Memtable Data Size: 81950175
> >> Memtable Switch Count: 31
> >> Read Count: 8074156
> >> Read Latency: 15.743 ms.
> >> Write Count: 2172404
> >> Write Latency: 0.037 ms.
> >> Pending Tasks: 0
> >> Bloom Filter False Postives: 1258
> >> Bloom Filter False Ratio: 0.03598
> >> Bloom Filter Space Used: 498672
> >> Key cache capacity: 200000
> >> Key cache size: 200000
> >> Key cache hit rate: 0.9965579513062582
> >> Row cache: disabled
> >> Compacted row minimum size: 51
> >> Compacted row maximum size: 89970660
> >> Compacted row mean size: 226626
> >>
> >>
> >> Our heap config is as follows
> >>
> >> -Xms8G -Xmx8G -Xmn800M -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC
> >> -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
> >> -XX:MaxTenuringThreshold=5 -XX:CMSInitiatingOccupancyFraction=75
> >> -XX:+UseCMSInitiatingOccupancyOnly
> >>
> >> from yaml
> >> in_memory_compaction_limit=64
> >> compaction_throughput_mb_sec=8
> >> multi_threaded_compaction=false
> >>
> >>  INFO [AntiEntropyStage:1] 2012-06-29 09:21:26,085 AntiEntropyService.java
> >> (line 762) [repair #2b6fcbf0-c1f9-11e1-0000-2ea8811bfbff] MsgIrtConv is
> >> fully synced
> >>  INFO [AntiEntropySessions:8] 2012-06-29 09:21:26,085
> >> AntiEntropyService.java (line 698) [repair
> >> #2b6fcbf0-c1f9-11e1-0000-2ea8811bfbff] session completed successfully
> >>  INFO [CompactionExecutor:857] 2012-06-29 09:21:31,219 CompactionTask.java
> >> (line 221) Compacted to
> >> [/home/sas/system/data/ZMail/MsgIrtConv-hc-858-Data.db,].  47,907,012 to
> >> 40,554,059 (~84% of original) bytes for 4,564 keys at 6.252080MB/s.  Time:
> >> 6,186ms.
> >>
> >> After this, the logs were filled with GC entries [ParNew/CMS]. ParNew ran
> >> about every 3 seconds, while CMS ran approximately every 30 seconds,
> >> continuously for 40 minutes.
> >>
> >>  INFO [ScheduledTasks:1] 2012-06-29 09:23:39,921 GCInspector.java (line
> >> 122) GC for ParNew: 776 ms for 2 collections, 2901990208 used; max is
> >> 8506048512
> >>  INFO [ScheduledTasks:1] 2012-06-29 09:23:42,265 GCInspector.java (line
> >> 122) GC for ParNew: 2028 ms for 2 collections, 3831282056 used; max is
> >> 8506048512
> >>
> >> .........................................
> >>
> >>  INFO [ScheduledTasks:1] 2012-06-29 10:07:53,884 GCInspector.java (line
> >> 122) GC for ParNew: 817 ms for 2 collections, 2808685768 used; max is
> >> 8506048512
> >>  INFO [ScheduledTasks:1] 2012-06-29 10:07:55,632 GCInspector.java (line
> >> 122) GC for ParNew: 1165 ms for 3 collections, 3264696776 used; max is
> >> 8506048512
> >>  INFO [ScheduledTasks:1] 2012-06-29 10:07:57,773 GCInspector.java (line
> >> 122) GC for ParNew: 1444 ms for 3 collections, 4234372296 used; max is
> >> 8506048512
> >>  INFO [ScheduledTasks:1] 2012-06-29 10:07:59,387 GCInspector.java (line
> >> 122) GC for ParNew: 1153 ms for 2 collections, 4910279080 used; max is
> >> 8506048512
> >>  INFO [ScheduledTasks:1] 2012-06-29 10:08:00,389 GCInspector.java (line
> >> 122) GC for ParNew: 697 ms for 2 collections, 4873857072 used; max is
> >> 8506048512
> >>  INFO [ScheduledTasks:1] 2012-06-29 10:08:01,443 GCInspector.java (line
> >> 122) GC for ParNew: 726 ms for 2 collections, 4941511184 used; max is
> >> 8506048512
> >>
> >> After this, the node stabilized and was back up and running. Any pointers
> >> would be greatly appreciated.
> >>
> >>
> >
> >
> 
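
The GCInspector lines in the original report can be summarized with a short script. The sketch below is Python, assumes each log entry sits on a single line (as it would in system.log, rather than wrapped as in this archive), and assumes the format matches the 1.0.7 lines shown above. The regex and function names are illustrative and not part of Cassandra. It estimates what share of wall-clock time between the first and last entry was reported as GC time.

```python
import re
from datetime import datetime

# Matches lines such as:
#  INFO [ScheduledTasks:1] 2012-06-29 10:07:53,884 GCInspector.java (line 122)
#  GC for ParNew: 817 ms for 2 collections, 2808685768 used; max is 8506048512
GC_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) GCInspector\.java "
    r"\(line \d+\) GC for (?P<gc>\w+): (?P<ms>\d+) ms for (?P<n>\d+) "
    r"collections, (?P<used>\d+) used; max is (?P<max>\d+)"
)


def parse_gc_lines(lines):
    """Return one dict per GCInspector line; other lines are skipped."""
    entries = []
    for line in lines:
        m = GC_LINE.search(line)
        if m is None:
            continue
        entries.append({
            "time": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f"),
            "gc": m.group("gc"),
            "ms": int(m.group("ms")),
            "collections": int(m.group("n")),
            "heap_used": int(m.group("used")),
            "heap_max": int(m.group("max")),
        })
    return entries


def gc_time_fraction(entries):
    """Reported GC ms after the first entry, divided by elapsed wall time.

    The first entry's GC time happened before the window starts, so it is
    left out. Returns None when there are fewer than two entries.
    """
    if len(entries) < 2:
        return None
    elapsed_ms = (entries[-1]["time"] - entries[0]["time"]).total_seconds() * 1000
    if elapsed_ms <= 0:
        return None
    return sum(e["ms"] for e in entries[1:]) / elapsed_ms


# The last six ParNew entries from the report, unwrapped to one line each.
PREFIX = " INFO [ScheduledTasks:1] 2012-06-29 "
SUFFIX = " used; max is 8506048512"
SAMPLE = [
    PREFIX + "10:07:53,884 GCInspector.java (line 122) GC for ParNew: 817 ms for 2 collections, 2808685768" + SUFFIX,
    PREFIX + "10:07:55,632 GCInspector.java (line 122) GC for ParNew: 1165 ms for 3 collections, 3264696776" + SUFFIX,
    PREFIX + "10:07:57,773 GCInspector.java (line 122) GC for ParNew: 1444 ms for 3 collections, 4234372296" + SUFFIX,
    PREFIX + "10:07:59,387 GCInspector.java (line 122) GC for ParNew: 1153 ms for 2 collections, 4910279080" + SUFFIX,
    PREFIX + "10:08:00,389 GCInspector.java (line 122) GC for ParNew: 697 ms for 2 collections, 4873857072" + SUFFIX,
    PREFIX + "10:08:01,443 GCInspector.java (line 122) GC for ParNew: 726 ms for 2 collections, 4941511184" + SUFFIX,
]

entries = parse_gc_lines(SAMPLE)
fraction = gc_time_fraction(entries)
print(f"{len(entries)} entries, GC share of wall time: {fraction:.0%}")
for e in entries:
    print(f"{e['time'].time()}  {e['gc']}  {e['ms']} ms  "
          f"heap {e['heap_used'] / e['heap_max']:.0%}")
```

Running the same functions over a whole system.log (for example `parse_gc_lines(open(path))`, where `path` is wherever the log lives on the node) gives a quick way to see when a GC storm started and ended, which can then be lined up against compaction and repair entries.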

