incubator-cassandra-user mailing list archives

From Aaron Morton <aa...@thelastpickle.com>
Subject Re: Cass 2.0.0: Extensive memory allocation when row_cache enabled
Date Thu, 07 Nov 2013 04:18:48 GMT
> Class Name | Shallow Heap | Retained Heap
> -------------------------------------------------------------------------------------------------------
> java.nio.HeapByteBuffer @ 0x7806a0848 | 48 | 80
> '- name org.apache.cassandra.db.Column @ 0x7806424e8 | 32 | 112
>    |- [338530] java.lang.Object[540217] @ 0x57d62f560 Unreachable | 2,160,888 | 2,160,888
>    |- [338530] java.lang.Object[810325] @ 0x591546540 | 3,241,320 | 7,820,328
>    |  '- elementData java.util.ArrayList @ 0x75e8424c0 | 24 | 7,820,352
>    |     |- list org.apache.cassandra.db.ArrayBackedSortedColumns$SlicesIterator @ 0x5940e0b18 | 48 | 128
>    |     |  '- val$filteredIter org.apache.cassandra.db.filter.SliceQueryFilter$1 @ 0x5940e0b48 | 32 | 7,820,568
>    |     |     '- val$iter org.apache.cassandra.db.filter.QueryFilter$2 @ 0x5940e0b68 Unreachable | 24 | 7,820,592
>    |     |- this$0, parent java.util.ArrayList$SubList @ 0x5940e0bb8 | 40 | 40
>    |     |  '- this$1 java.util.ArrayList$SubList$1 @ 0x5940e0be0 | 40 | 80
>    |     |     '- currentSlice org.apache.cassandra.db.ArrayBackedSortedColumns$SlicesIterator @ 0x5940e0b18 | 48 | 128
>    |     |        '- val$filteredIter org.apache.cassandra.db.filter.SliceQueryFilter$1 @ 0x5940e0b48 | 32 | 7,820,568
>    |     |           '- val$iter org.apache.cassandra.db.filter.QueryFilter$2 @ 0x5940e0b68 Unreachable | 24 | 7,820,592
>    |     |- columns org.apache.cassandra.db.ArrayBackedSortedColumns @ 0x5b0a33488 | 32 | 56
>    |     |  '- val$cf org.apache.cassandra.db.filter.SliceQueryFilter$1 @ 0x5940e0b48 | 32 | 7,820,568
>    |     |     '- val$iter org.apache.cassandra.db.filter.QueryFilter$2 @ 0x5940e0b68 Unreachable | 24 | 7,820,592
>    |     '- Total: 3 entries | |
>    |- [338530] java.lang.Object[360145] @ 0x7736ce2f0 Unreachable | 1,440,600 | 1,440,600
>    '- Total: 3 entries | |

Are you doing large slices, or could you have a lot of tombstones on the rows? 
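If it helps to check the slice-width side of that, one rough indicator (a sketch only; this assumes nodetool is available on the node, and <keyspace> / <column_family> are placeholders for your own names) is the per-CF histogram, whose column count histogram shows how many columns the rows hold:

    nodetool cfhistograms <keyspace> <column_family>

Very wide rows there would point at large slices being materialised on reads.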

> We have disabled the row cache on one node to see the difference. Please
> see the attached plots from VisualVM; I think that the effect is quite
> visible.
The default row cache (the SerializingCacheProvider) is off the JVM heap; have you changed to the ConcurrentLinkedHashCacheProvider?

One way the SerializingCacheProvider could impact GC is if the CF takes a lot of writes. The
SerializingCacheProvider invalidates the row when it is written to, and has to read the entire
row and serialise it on a cache miss.
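For reference, a minimal sketch of the cache-related cassandra.yaml settings involved (the row_cache_provider option name is from memory for this version line and the 256 figure is only an illustration, so please verify both against your 2.0.0 config before changing anything):

    row_cache_size_in_mb: 256                        # smaller than the current 1024; see the sizing advice quoted below
    row_cache_save_period: 14400
    # row_cache_provider: SerializingCacheProvider   # off-heap default; ConcurrentLinkedHashCacheProvider keeps rows on heap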

>> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10G -Xmx10G
>> -Xmn1024M -XX:+HeapDumpOnOutOfMemoryError
You probably want the heap to be 4G to 8G in size; 10G will encounter longer pauses. 
Also the size of the new heap may be too big depending on the number of cores. I would recommend
trying 800M.
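As a concrete starting point built only from the numbers above (keeping all of your other flags as they are, and treating these as values to test rather than tuned settings):

    -Xms8G -Xmx8G -Xmn800M

then compare the GC times and the VisualVM plots again before settling on anything.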


> prg01.visual.vm.png
Shows the heap growing very quickly. This could be due to wide reads or a high write throughput.


Hope that helps. 

 

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 7/11/2013, at 6:29 am, Chris Burroughs <chris.burroughs@gmail.com> wrote:

> Both caches involve several objects per entry (What do we want? Packed objects. When do we want them? Now!). The "size" is an estimate of the off-heap values only, not the total size nor the number of entries.
> 
> An acceptable size will depend on your data and access patterns. In one case we had a cluster that at 512mb would go into a GC death spiral despite plenty of free heap (presumably just due to the number of objects), while empirically the cluster runs smoothly at 384mb.
> 
> Your caches appear on the larger side; I suggest trying smaller values and only increasing them when it produces measurable, sustained gains.
> 
> On 11/05/2013 04:04 AM, Jiri Horky wrote:
>> Hi there,
>> 
>> we are seeing extensive memory allocation leading to quite long and
>> frequent GC pauses when using the row cache. This is on a Cassandra 2.0.0
>> cluster with the JNA 4.0 library and the following settings:
>> 
>> key_cache_size_in_mb: 300
>> key_cache_save_period: 14400
>> row_cache_size_in_mb: 1024
>> row_cache_save_period: 14400
>> commitlog_sync: periodic
>> commitlog_sync_period_in_ms: 10000
>> commitlog_segment_size_in_mb: 32
>> 
>> -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10G -Xmx10G
>> -Xmn1024M -XX:+HeapDumpOnOutOfMemoryError
>> -XX:HeapDumpPath=/data2/cassandra-work/instance-1/cassandra-1383566283-pid1893.hprof
>> -Xss180k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
>> -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB -XX:+UseCondCardMark
>> 
>> We have disabled the row cache on one node to see the difference. Please
>> see the attached plots from VisualVM; I think that the effect is quite
>> visible. I have also taken 10x "jmap -histo" after 5s on an affected
>> server and plotted the result, attached as well.
>> 
>> I have taken a dump of the application when the heap size was 10GB; most
>> of the memory was unreachable, which was expected. The majority was used
>> by 55-59M objects of the HeapByteBuffer, byte[] and
>> org.apache.cassandra.db.Column classes. I also include a list of inbound
>> references to the HeapByteBuffer objects, from which it should be visible
>> where they are being allocated. This was acquired using Eclipse MAT.
>> 
>> Here is the comparison of GC times when row cache enabled and disabled:
>> 
>> prg01 - row cache enabled
>>       - uptime 20h45m
>>       - ConcurrentMarkSweep - 11494686ms
>>       - ParNew - 14690885 ms
>>       - time spent in GC: 35%
>> prg02 - row cache disabled
>>       - uptime 23h45m
>>       - ConcurrentMarkSweep - 251ms
>>       - ParNew - 230791 ms
>>       - time spent in GC: 0.27%
>> 
>> I would be grateful for any hints. Please let me know if you need any
>> further information. For now, we are going to disable the row cache.
>> 
>> Regards
>> Jiri Horky
>> 
> 

