incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Wei Zhu <wz1...@yahoo.com>
Subject Re: "Heap is N.N full." Immediately on startup
Date Thu, 28 Feb 2013 20:27:21 GMT
Just read this blog:

http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2


It says:

1.2 moves the two biggest remaining culprits off-heap: compression metadata and per-row bloom
filters.
Compression metadata takes about 20GB of memory per TB of compressed data. Moving this into
native memory is especially important now that compression is enabled by default.

Are you using Compression for your CF? Compression metadata seems to be consuming a lot of
memory. 

-Wei


________________________________
 From: Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com>
To: "user@cassandra.apache.org" <user@cassandra.apache.org> 
Sent: Thursday, February 28, 2013 12:19 PM
Subject: Re: "Heap is N.N full." Immediately on startup
 

INFO [ScheduledTasks:1] 2013-02-28 14:48:59,335 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1315 ms for 1 collections, 6558662864 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:49:22,530 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
661 ms for 1 collections, 7302433920 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:49:43,595 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
820 ms for 1 collections, 7494556024 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:50:06,002 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1057 ms for 1 collections, 7867906264 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:50:26,432 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1025 ms for 1 collections, 7845991024 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:50:50,266 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
624 ms for 1 collections, 7148089096 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:51:12,549 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
617 ms for 1 collections, 7092289680 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:51:34,433 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
821 ms for 1 collections, 7454785104 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:51:58,054 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1326 ms for 1 collections, 7767441672 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:52:22,980 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1103 ms for 1 collections, 7649036600 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:52:42,887 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
877 ms for 1 collections, 7317935816 used; max is 8422162432

Does not take too long, but maybe it happens a bit too often ? I'll experiment with reducing
the bloom filter and/or increasing the heap ( maybe the newgen a bit )

Thanks for the tips!
Andras
From: aaron morton <aaron@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Thursday 28 February 2013 19:43
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: "Heap is N.N full." Immediately on startup


- still getting 1-2 sstable reads with LCS
That's bang in the wheel househttp://www.datastax.com/dev/blog/when-to-use-leveled-compaction


However at startup I see a 5GB old gen ( that seems to be very stable at around 5.5GB under
moderate 90:10 read:write load - couple hundred q/s ) 
At 300 to 400 million rows that sounds roughly right. I've not done any calculations, but
I just spilt a coffee and checked the stain it made. 

The interesting thing is what happens at the end of a CMS run. How low does it get ?

 Also hinted handoff is off, as full GC pauses of 1-3seconds on nodes I have not yet attended
to 
I would look at the first. 
Search the user group for "correlate compaction gc" as a starting point. 

Cheers


-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 26/02/2013, at 3:22 AM, Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com> wrote:

( I am monitoring via a visual vm plugin that shows generation sizes )
>I've increased the index sampling rate from 64 to 512 and the bloom filter fp ratio from
default to 0.1 ( and rebuilt stables ) - still getting 1-2 sstable reads with LCS
>However at startup I see a 5GB old gen ( that seems to be very stable at around 5.5GB
under moderate 90:10 read:write load - couple hundred q/s ) is that a realistic size with
these values and a ~200GB CF with 3-400mil rows? ( no secondary indexes, skinny rows 1-5K
each ) ? 
>
>
>Also hinted handoff is off, as full GC pauses of 1-3seconds on nodes I have not yet attended
to ( sampling rate at 64, bloom fp default ) are kicking off the failure detector even at
a phi_convict_threshold of 12 ( apparently ? ) and everything is in an up/down dance :-)
>
>
>Thanks!
>Andras
>
>From: aaron morton <aaron@thelastpickle.com>
>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Date: Friday 22 February 2013 17:44
>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>Subject: Re: "Heap is N.N full." Immediately on startup
>
>
>
>To get a good idea of how GC is performing turn on the GC logging in cassandra-env.sh. 

>
>
>After a full cms GC event, see how big the tenured heap is. If it's not reducing enough
then GC will never get far enough ahead. 
>
>
>Cheers
>
>
>-----------------
>Aaron Morton
>Freelance Cassandra Developer
>New Zealand
>
>
>@aaronmorton
>http://www.thelastpickle.com
>
>On 22/02/2013, at 8:37 AM, Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com>
wrote:
>
>Thank you- indeed my index interval is 64 with a CF of 300M rows + bloom filter false
positive chance was default.
>>Raising the index interval to 512 didn't fix this  alone, so I guess I'll have to
set the bloom filter to some reasonable value and scrub.
>>
>>From: aaron morton <aaron@thelastpickle.com>
>>Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>Date: Thursday 21 February 2013 17:58
>>To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>>Subject: Re: "Heap is N.N full." Immediately on startup
>>
>>
>>
>>My first guess would be the bloom filter and index sampling from lots-o-rows  
>>
>>
>>Check the row count in cfstats
>>Check the bloom filter size in cfstats. 
>>
>>
>>Background on memory requirements http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html
>>
>>
>>Cheers
>>
>>
>>-----------------
>>Aaron Morton
>>Freelance Cassandra Developer
>>New Zealand
>>
>>
>>@aaronmorton
>>http://www.thelastpickle.com
>>
>>On 20/02/2013, at 11:27 PM, Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com>
wrote:
>>
>>Hey list,
>>>
>>>
>>>Any ideas ( before I take a heap dump ) what might be consuming my 8GB JVM heap
at startup in Cassandra 1.1.6 besides
>>>	* row cache : not persisted and is at 0 keys when this warning is produced
>>>	* Memtables : no write traffic at startup, my app's column families are durable_writes:false
>>>	* Pending tasks : no pending tasks, except for 928 compactions ( not sure where
those are coming from )
>>>I drew these conclusions from the StatusLogger output below: 
>>>
>>>
>>>INFO [ScheduledTasks:1] 2013-02-20 05:13:25,198 GCInspector.java (line 122) GC
for ConcurrentMarkSweep: 14959 ms for 2 collections, 7017934560 used; max is 8375238656
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,198 StatusLogger.java (line 57)
Pool Name                    Active   Pending   Blocked
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,199 StatusLogger.java (line 72)
ReadStage                         0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,200 StatusLogger.java (line 72)
RequestResponseStage              0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,200 StatusLogger.java (line 72)
ReadRepairStage                   0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,200 StatusLogger.java (line 72)
MutationStage                     0        -1         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72)
ReplicateOnWriteStage             0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72)
GossipStage                       0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72)
AntiEntropyStage                  0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72)
MigrationStage                    0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72)
StreamStage                       0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72)
MemtablePostFlusher               0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72)
FlushWriter                       0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72)
MiscStage                         0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72)
commitlog_archiver                0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,203 StatusLogger.java (line 72)
InternalResponseStage             0         0         0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 77)
CompactionManager                 0       928
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 89)
MessagingService                n/a       0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 99)
Cache Type                     Size                 Capacity        
      KeysToSave                                                
        Provider
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 100)
KeyCache                         25                       25    
                 all                                        
                        
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 106)
RowCache                          0                        0    
                 all              org.apache.cassandra.cache.SerializingCacheProvider
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 113)
ColumnFamily                Memtable ops,data
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116)
MYAPP_1.CF                        0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116) MYAPP_2.CF 
                       0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116)
HiveMetaStore.MetaStore                   0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116)
system.NodeIdInfo                         0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116)
system.IndexInfo                          0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116)
system.LocationInfo                       0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
system.Versions                           0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
system.schema_keyspaces                   0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
system.Migrations                         0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
system.schema_columnfamilies                 0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
system.schema_columns                     0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
system.HintsColumnFamily                  0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
system.Schema                             0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
cfs_archive.sblocks                       0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116)
cfs_archive.cleanup                       0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
cfs_archive.inode                         0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
cfs.cleanup                               0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
cfs.sblocks                               0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
cfs.inode                                 0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
OpsCenter.events                          0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
OpsCenter.rollups300                      0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
OpsCenter.rollups86400                    0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
OpsCenter.pdps                      345,54448
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116)
OpsCenter.events_timeline                 0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116)
OpsCenter.settings                        0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116)
OpsCenter.rollups7200                     0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116)
OpsCenter.rollups60                       0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116)
demo.users                                0,0
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116) MYAPP_3.CF 
                       0,0
>>> WARN [ScheduledTasks:1] 2013-02-20 05:13:25,216 GCInspector.java (line 145)
Heap is 0.8379384574280004 full.  You may need to reduce memtable and/or cache sizes.  Cassandra
will now flush up to the two largest memtables to free up memory.  Adjust flush_largest_memtables_at
threshold in cassandra.yaml if you don't want Cassandra to do this automatically
>>> WARN [ScheduledTasks:1] 2013-02-20 05:13:25,217 StorageService.java (line 2855)
Flushing CFS(Keyspace='OpsCenter', ColumnFamily='pdps') to relieve memory pressure
>>> INFO [ScheduledTasks:1] 2013-02-20 05:13:25,217 ColumnFamilyStore.java (line
659) Enqueuing flush of Memtable-pdps@98567689(15238/54448 serialized/live bytes, 345 ops)
>>> INFO [FlushWriter:6] 2013-02-20 05:13:25,218 Memtable.java (line 264) Writing
Memtable-pdps@98567689(15238/54448 serialized/live bytes, 345 ops)
>>> INFO [FlushWriter:6] 2013-02-20 05:13:25,241 Memtable.java (line 305) Completed
flushing /var/lib/cassandra/data/OpsCenter/pdps/OpsCenter-pdps-hf-127-Data.db (5394 bytes)
for commitlog position ReplayPosition(segmentId=1361354508149, :
>>>
>>>
>>>
>>>
>>>
>>>
>>>Thanks!
>>>Andras
>>>
>>>
>>
>
Mime
View raw message