incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andras Szerdahelyi <andras.szerdahe...@ignitionone.com>
Subject Re: "Heap is N.N full." Immediately on startup
Date Thu, 28 Feb 2013 21:03:06 GMT
Wei,

I am using compression. Without it, I'd be spending all my coin on disks I'm afraid. But thanks
for the tip, I didn't realise it was taking just that much. I'll try and contrast it with
disk costs and read performance.

Regards,
andras

From: Wei Zhu <wz1975@yahoo.com<mailto:wz1975@yahoo.com>>
Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>,
Wei Zhu <wz1975@yahoo.com<mailto:wz1975@yahoo.com>>
Date: Thursday 28 February 2013 21:27
To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Subject: Re: "Heap is N.N full." Immediately on startup

Just read this blog:

http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2

It says:

1.2 moves the two biggest remaining culprits off-heap: compression metadata and per-row bloom
filters.
Compression metadata<http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression>
takes about 20GB of memory per TB of compressed data. Moving this into native memory is especially
important now that compression is enabled by default.

Are you using Compression for your CF? Compression metadata seems to be consuming a lot of
memory.

-Wei

________________________________
From: Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com<mailto:andras.szerdahelyi@ignitionone.com>>
To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Sent: Thursday, February 28, 2013 12:19 PM
Subject: Re: "Heap is N.N full." Immediately on startup

INFO [ScheduledTasks:1] 2013-02-28 14:48:59,335 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1315 ms for 1 collections, 6558662864 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:49:22,530 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
661 ms for 1 collections, 7302433920 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:49:43,595 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
820 ms for 1 collections, 7494556024 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:50:06,002 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1057 ms for 1 collections, 7867906264 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:50:26,432 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1025 ms for 1 collections, 7845991024 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:50:50,266 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
624 ms for 1 collections, 7148089096 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:51:12,549 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
617 ms for 1 collections, 7092289680 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:51:34,433 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
821 ms for 1 collections, 7454785104 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:51:58,054 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1326 ms for 1 collections, 7767441672 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:52:22,980 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
1103 ms for 1 collections, 7649036600 used; max is 8422162432
 INFO [ScheduledTasks:1] 2013-02-28 14:52:42,887 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
877 ms for 1 collections, 7317935816 used; max is 8422162432

Does not take too long, but maybe it happens a bit too often ? I'll experiment with reducing
the bloom filter and/or increasing the heap ( maybe the newgen a bit )

Thanks for the tips!
Andras

From: aaron morton <aaron@thelastpickle.com<mailto:aaron@thelastpickle.com>>
Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Date: Thursday 28 February 2013 19:43
To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Subject: Re: "Heap is N.N full." Immediately on startup

- still getting 1-2 sstable reads with LCS
That's bang in the wheel house
http://www.datastax.com/dev/blog/when-to-use-leveled-compaction

However at startup I see a 5GB old gen ( that seems to be very stable at around 5.5GB under
moderate 90:10 read:write load - couple hundred q/s )
At 300 to 400 million rows that sounds roughly right. I've not done any calculations, but
I just spilt a coffee and checked the stain it made.

The interesting thing is what happens at the end of a CMS run. How low does it get ?

 Also hinted handoff is off, as full GC pauses of 1-3seconds on nodes I have not yet attended
to
I would look at the first.
Search the user group for "correlate compaction gc" as a starting point.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com<http://www.thelastpickle.com/>

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com<http://www.thelastpickle.com/>

On 26/02/2013, at 3:22 AM, Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com<mailto:andras.szerdahelyi@ignitionone.com>>
wrote:

( I am monitoring via a visual vm plugin that shows generation sizes )
I've increased the index sampling rate from 64 to 512 and the bloom filter fp ratio from default
to 0.1 ( and rebuilt stables ) - still getting 1-2 sstable reads with LCS
However at startup I see a 5GB old gen ( that seems to be very stable at around 5.5GB under
moderate 90:10 read:write load - couple hundred q/s ) is that a realistic size with these
values and a ~200GB CF with 3-400mil rows? ( no secondary indexes, skinny rows 1-5K each )
?

Also hinted handoff is off, as full GC pauses of 1-3seconds on nodes I have not yet attended
to ( sampling rate at 64, bloom fp default ) are kicking off the failure detector even at
a phi_convict_threshold of 12 ( apparently ? ) and everything is in an up/down dance :-)

Thanks!
Andras

From: aaron morton <aaron@thelastpickle.com<mailto:aaron@thelastpickle.com>>
Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Date: Friday 22 February 2013 17:44
To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Subject: Re: "Heap is N.N full." Immediately on startup

To get a good idea of how GC is performing turn on the GC logging in cassandra-env.sh.

After a full cms GC event, see how big the tenured heap is. If it's not reducing enough then
GC will never get far enough ahead.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com<http://www.thelastpickle.com/>

On 22/02/2013, at 8:37 AM, Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com<mailto:andras.szerdahelyi@ignitionone.com>>
wrote:

Thank you- indeed my index interval is 64 with a CF of 300M rows + bloom filter false positive
chance was default.
Raising the index interval to 512 didn't fix this  alone, so I guess I'll have to set the
bloom filter to some reasonable value and scrub.

From: aaron morton <aaron@thelastpickle.com<mailto:aaron@thelastpickle.com>>
Reply-To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Date: Thursday 21 February 2013 17:58
To: "user@cassandra.apache.org<mailto:user@cassandra.apache.org>" <user@cassandra.apache.org<mailto:user@cassandra.apache.org>>
Subject: Re: "Heap is N.N full." Immediately on startup

My first guess would be the bloom filter and index sampling from lots-o-rows

Check the row count in cfstats
Check the bloom filter size in cfstats.

Background on memory requirements http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com<http://www.thelastpickle.com/>

On 20/02/2013, at 11:27 PM, Andras Szerdahelyi <andras.szerdahelyi@ignitionone.com<mailto:andras.szerdahelyi@ignitionone.com>>
wrote:

Hey list,

Any ideas ( before I take a heap dump ) what might be consuming my 8GB JVM heap at startup
in Cassandra 1.1.6 besides

  *   row cache : not persisted and is at 0 keys when this warning is produced
  *   Memtables : no write traffic at startup, my app's column families are durable_writes:false
  *   Pending tasks : no pending tasks, except for 928 compactions ( not sure where those
are coming from )

I drew these conclusions from the StatusLogger output below:

INFO [ScheduledTasks:1] 2013-02-20 05:13:25,198 GCInspector.java (line 122) GC for ConcurrentMarkSweep:
14959 ms for 2 collections, 7017934560 used; max is 8375238656
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,198 StatusLogger.java (line 57) Pool Name   
                Active   Pending   Blocked
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,199 StatusLogger.java (line 72) ReadStage   
                     0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,200 StatusLogger.java (line 72) RequestResponseStage
             0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,200 StatusLogger.java (line 72) ReadRepairStage
                  0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,200 StatusLogger.java (line 72) MutationStage
                    0        -1         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72) ReplicateOnWriteStage
            0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72) GossipStage 
                     0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72) AntiEntropyStage
                 0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72) MigrationStage
                   0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,201 StatusLogger.java (line 72) StreamStage 
                     0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72) MemtablePostFlusher
              0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72) FlushWriter 
                     0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72) MiscStage   
                     0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,202 StatusLogger.java (line 72) commitlog_archiver
               0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,203 StatusLogger.java (line 72) InternalResponseStage
            0         0         0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 77) CompactionManager
                0       928
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 89) MessagingService
               n/a       0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 99) Cache Type  
                  Size                 Capacity               KeysToSave                 
                                       Provider
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,212 StatusLogger.java (line 100) KeyCache   
                     25                       25                      all
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 106) RowCache   
                      0                        0                      all              org.apache.cassandra.cache.SerializingCacheProvider
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 113) ColumnFamily
               Memtable ops,data
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116) MYAPP_1.CF 
                      0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116) MYAPP_2.CF 
                       0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116) HiveMetaStore.MetaStore
                  0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116) system.NodeIdInfo
                        0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116) system.IndexInfo
                         0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,213 StatusLogger.java (line 116) system.LocationInfo
                      0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) system.Versions
                          0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) system.schema_keyspaces
                  0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) system.Migrations
                        0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) system.schema_columnfamilies
                0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) system.schema_columns
                    0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) system.HintsColumnFamily
                 0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) system.Schema
                            0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) cfs_archive.sblocks
                      0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,214 StatusLogger.java (line 116) cfs_archive.cleanup
                      0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) cfs_archive.inode
                        0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) cfs.cleanup
                              0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) cfs.sblocks
                              0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) cfs.inode  
                              0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) OpsCenter.events
                         0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) OpsCenter.rollups300
                     0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) OpsCenter.rollups86400
                   0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) OpsCenter.pdps
                     345,54448
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,215 StatusLogger.java (line 116) OpsCenter.events_timeline
                0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116) OpsCenter.settings
                       0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116) OpsCenter.rollups7200
                    0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116) OpsCenter.rollups60
                      0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116) demo.users 
                              0,0
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,216 StatusLogger.java (line 116) MYAPP_3.CF 
                       0,0
 WARN [ScheduledTasks:1] 2013-02-20 05:13:25,216 GCInspector.java (line 145) Heap is 0.8379384574280004
full.  You may need to reduce memtable and/or cache sizes.  Cassandra will now flush up to
the two largest memtables to free up memory.  Adjust flush_largest_memtables_at threshold
in cassandra.yaml if you don't want Cassandra to do this automatically
 WARN [ScheduledTasks:1] 2013-02-20 05:13:25,217 StorageService.java (line 2855) Flushing
CFS(Keyspace='OpsCenter', ColumnFamily='pdps') to relieve memory pressure
 INFO [ScheduledTasks:1] 2013-02-20 05:13:25,217 ColumnFamilyStore.java (line 659) Enqueuing
flush of Memtable-pdps@98567689(15238/54448 serialized/live bytes, 345 ops)
 INFO [FlushWriter:6] 2013-02-20 05:13:25,218 Memtable.java (line 264) Writing Memtable-pdps@98567689(15238/54448
serialized/live bytes, 345 ops)
 INFO [FlushWriter:6] 2013-02-20 05:13:25,241 Memtable.java (line 305) Completed flushing
/var/lib/cassandra/data/OpsCenter/pdps/OpsCenter-pdps-hf-127-Data.db (5394 bytes) for commitlog
position ReplayPosition(segmentId=1361354508149, :



Thanks!
Andras







Mime
View raw message