incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Advice on memory warning
Date Fri, 26 Apr 2013 01:44:14 GMT
There have been a lot of discussions about GC tuning on this mailing list. Here's a really quick
set of guidelines I use; please search the mail archive if they don't answer your question.


If heavy GC activity correlates with Cassandra compaction, do one or more of the following (a cassandra.yaml sketch is below):
* reduce concurrent_compactors to 2 or 3
* reduce compaction_throughput_mb_per_sec
* reduce in_memory_compaction_limit_in_mb

These are heavy-handed changes designed to get things back under control; you will probably want
to revert some of them later.
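
For concreteness, a minimal cassandra.yaml sketch of the settings above. The values are
illustrative conservative starting points, not defaults or recommendations:

concurrent_compactors: 2
compaction_throughput_mb_per_sec: 8
in_memory_compaction_limit_in_mb: 32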

Enable GC logging in cassandra-env.sh and look at how much memory is in use after a full CMS
collection. If this is more than 50% of the heap you may end up doing a lot of GC. If you
have hundreds of millions of rows per node, on pre-1.2, increase bloom_filter_fp_chance on the
CF's and increase index_interval in the yaml config to reduce JVM memory use.
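
As a sketch, on 1.2 the per-CF setting can be changed via CQL3 and the index sampling via
cassandra.yaml. The keyspace/CF names here are placeholders and the values only illustrative
(larger values mean smaller filters and samples, at the cost of extra disk reads):

ALTER TABLE my_keyspace.my_cf WITH bloom_filter_fp_chance = 0.1;

index_interval: 512   (in cassandra.yaml; the default is 128)

Note the new bloom filter setting only takes effect as sstables are rewritten, e.g. by
compaction or nodetool upgradesstables.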

If you have wide rows, consider using (on 4 to 8 cores):
HEAP_NEWSIZE: 1000M
SurvivorRatio: 4
MaxTenuringThreshold: 4
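
In cassandra-env.sh that would look roughly like this (a sketch; in the stock file
SurvivorRatio and MaxTenuringThreshold already appear as JVM_OPTS lines you can edit, and
MAX_HEAP_SIZE/HEAP_NEWSIZE must be set or unset as a pair):

HEAP_NEWSIZE="1000M"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=4"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=4"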

Look at the tenuring distribution in the GC log to see how many ParNew passes objects survive.
If most objects only reach tenuring age 1 or 2, consider running with MaxTenuringThreshold=2;
this can help reduce the amount of premature tenuring.
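
To get the tenuring distribution into the GC log, the relevant HotSpot flags in
cassandra-env.sh look like this (the log path is illustrative; stock cassandra-env.sh files
ship similar lines commented out):

JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"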

GC problems are a combination of workload and configuration, and sometimes take a while to
sort out. 

Hope that helps 
 
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 24/04/2013, at 11:53 PM, Michael Theroux <mtheroux2@yahoo.com> wrote:

> Hello,
> 
> Just to wrap up on my part of this thread, tuning the CMS initiating occupancy threshold (-XX:CMSInitiatingOccupancyFraction) to 70 appears to have resolved my issues with the memory warnings. However, I don't believe this would be a solution to all the issues mentioned below. It does make sense to me to tune this value below the "flush_largest_memtables_at" value in cassandra.yaml so that the CMS collection kicks in before we start flushing memtables to free memory.
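> 
> For reference, that combination looks roughly like this (a sketch; 70 is the value that worked for me, and -XX:+UseCMSInitiatingOccupancyOnly should already be present in the stock cassandra-env.sh so the fraction is honored consistently):
> 
> JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70"   (cassandra-env.sh)
> flush_largest_memtables_at: 0.75   (cassandra.yaml default)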
> 
> Thanks!
> -Mike
> 
> On Apr 23, 2013, at 12:47 PM, Haithem Jarraya wrote:
> 
>> We are facing a similar issue and have not been able to keep the ring stable. We are using C* 1.2.3 on CentOS 6: 32 GB RAM, 8 GB heap, 6 nodes.
>> The total data is ~84 GB (which is relatively small for C* to handle, with an RF of 3). Our application is read-heavy; we see the GC complaints on all nodes. I copied and pasted the output below.
>> We also usually see much larger values for ReadStage Pending; I'm not sure what the best advice for this is.
>> 
>> Thanks,
>> 
>> Haithem
>>  
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:02,118 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 911 ms for 1 collections, 5945542968 used; max is 8199471104
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:16,051 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 322 ms for 1 collections, 5639896576 used; max is 8199471104
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,829 GCInspector.java (line 119) GC for ConcurrentMarkSweep: 2273 ms for 1 collections, 6762618136 used; max is 8199471104
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,830 StatusLogger.java (line 53) Pool Name                    Active   Pending   Blocked
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,830 StatusLogger.java (line 68) ReadStage                         4         4         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) RequestResponseStage              1         6         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) ReadRepairStage                   0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) MutationStage                     0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,831 StatusLogger.java (line 68) ReplicateOnWriteStage             0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) GossipStage                       0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) AntiEntropyStage                  0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) MigrationStage                    0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,832 StatusLogger.java (line 68) MemtablePostFlusher               0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) FlushWriter                       0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) MiscStage                         0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,833 StatusLogger.java (line 68) commitlog_archiver                0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) InternalResponseStage             0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) AntiEntropySessions               0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,834 StatusLogger.java (line 68) HintedHandoff                     0         0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,843 StatusLogger.java (line 73) CompactionManager                 0         0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 85) MessagingService                n/a      15,1
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 95) Cache Type                     Size                 Capacity               KeysToSave                                                         Provider
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 96) KeyCache                  251658064                251658081                      all
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 102) RowCache                          0                        0                      all              org.apache.cassandra.cache.SerializingCacheProvider
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,844 StatusLogger.java (line 109) ColumnFamily                Memtable ops,data
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.local                              0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.peers                              0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.batchlog                           0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,845 StatusLogger.java (line 112) system.NodeIdInfo                         0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.LocationInfo                       0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.Schema                             0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.Migrations                         0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_keyspaces                   0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_columns                     0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,846 StatusLogger.java (line 112) system.schema_columnfamilies              0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.IndexInfo                          0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.range_xfers                        0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.peer_events                        0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.hints                              0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,847 StatusLogger.java (line 112) system.HintsColumnFamily                  0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo                                     0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo2                                    0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo3                                    0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo4                                    0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,848 StatusLogger.java (line 112) x.foo5                                    0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) x.foo6                                    0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) x.foo7                                    0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_auth.users                         0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_traces.sessions                    0,0
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,849 StatusLogger.java (line 112) system_traces.events                      0,0
>> WARN [ScheduledTasks:1] 2013-04-23 16:40:30,850 GCInspector.java (line 142) Heap is 0.824762725573964 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
>> INFO [ScheduledTasks:1] 2013-04-23 16:40:30,850 StorageService.java (line 3537) Unable to reduce heap usage since there are no dirty column families
>> 
>> On 23 April 2013 16:52, Ralph Goers <ralph.goers@dslextreme.com> wrote:
>> We are using DSE, which I believe is also 1.1.9. We have had an essentially unusable cluster for months due to this error. In our case, once it starts doing this it keeps flushing memtables to disk and eventually fills up the disk to the point where it can't compact. If we catch it soon enough and restart the node, it usually recovers.
>> 
>> In our case, the heap size is 12 GB. As I understand it, Cassandra will give 1/3 of that to memtables. I then noticed that we have one column family that is using nearly 4 GB in bloom filters on each node. Since the nodes start doing this when the heap reaches 9 GB, we essentially have only 1 GB of free memory, so when compactions, cleanups, etc. take place, this situation starts happening. We are working to change our data model to try to resolve this.
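>> 
>> A quick way to see the bloom filter memory per column family, as a rough sketch: nodetool cfstats reports a "Bloom Filter Space Used" line (in bytes) for each CF, e.g.
>> 
>> nodetool cfstats | grep -E 'Column Family:|Bloom Filter Space Used'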
>> 
>> Ralph
>> 
>> On Apr 19, 2013, at 8:00 AM, Michael Theroux wrote:
>> 
>> > Hello,
>> >
>> > We've recently upgraded from m1.large to m1.xlarge instances on AWS to handle additional load, but also to relieve memory pressure. It appears to have accomplished both; however, we are still getting a warning, 0-3 times a day, on our database nodes:
>> >
>> > WARN [ScheduledTasks:1] 2013-04-19 14:17:46,532 GCInspector.java (line 145) Heap is 0.7529240824406468 full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
>> >
>> > This is happening much less frequently than before the upgrade, but after essentially doubling the amount of available memory, I'm curious what I can do to determine what is happening during these periods.
>> >
>> > I am collecting all the JMX statistics.  Memtable space is elevated but not extraordinarily high.  No GC messages are being output to the log.
>> >
>> > These warnings do seem to occur during compactions of column families using LCS with wide rows, but I'm not sure there is a direct correlation.
>> >
>> > We are running Cassandra 1.1.9, with a maximum heap of 8G.
>> >
>> > Any advice?
>> > Thanks,
>> > -Mike
>> 
>> 
> 

