hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-17819) Reduce the heap overhead for BucketCache
Date Fri, 03 Nov 2017 11:53:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237495#comment-16237495
] 

Anoop Sam John edited comment on HBASE-17819 at 11/3/17 11:52 AM:
------------------------------------------------------------------

To explain the approach: it differs a bit from the V2 patch. The major changes are
1. BucketEntry is extended to create the SharedMemory BucketEntry. In file mode there is no need to keep the ref count, as that is not a shared-memory type, so I removed the new states added for HBASE-11425 from BucketEntry. The off-heap mode BucketEntry is now an extension that carries the new states.
2. Removed the CSLM that kept the blocks info keyed by HFile name. evictBlocksByHfileName now has a perf impact, as it has to iterate through all the entries to check whether each block entry belongs to the given file. To compensate, evictBlocksByHfileName was changed to an async op: a dedicated eviction thread does this work. Anyway, even if we do not remove these blocks, or their removal is delayed, they will eventually be removed because we have the LRU algo for eviction: when no space is left for new block additions, eviction happens and removes unused blocks. Moreover, eviction of blocks on HFile close is off by default (we have a config to turn it on). For compaction, evictByHFiles still happens for the compacted files; there will just be a bit more delay before the blocks are actually removed.
But we save a lot of heap memory per entry with this approach. The math is in the comment above:
{quote}
Now - 32 + 64 + 40 + 40 = 176
After patch - 32 + 48 + 40 = 120
Tested with Java Instrumentation
{quote}
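As a rough sketch of the two changes described above (class and method names here are illustrative assumptions, not the patch itself — in particular the real BucketCache keys entries by BlockCacheKey, not a plain String): the ref-count state lives only in the shared-memory subclass, and per-file eviction scans the whole map on a dedicated thread instead of consulting a per-file CSLM index.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Base entry: file mode needs no ref count, so the base class omits it.
class BucketEntry {
    final int offsetBase;
    final int length;

    BucketEntry(int offsetBase, int length) {
        this.offsetBase = offsetBase;
        this.length = length;
    }

    boolean isSharedMemory() {
        return false;
    }
}

// Off-heap (shared memory) entry: only this subclass pays for the
// ref-count and eviction-marker state.
class SharedMemoryBucketEntry extends BucketEntry {
    final AtomicInteger refCount = new AtomicInteger(0);
    volatile boolean markedForEvict = false;

    SharedMemoryBucketEntry(int offsetBase, int length) {
        super(offsetBase, length);
    }

    @Override
    boolean isSharedMemory() {
        return true;
    }
}

public class EvictSketch {
    // Keyed by "<hfileName>_<offset>" for brevity in this sketch.
    static final Map<String, BucketEntry> backingMap = new ConcurrentHashMap<>();

    // Dedicated eviction thread (daemon so the JVM can exit normally).
    static final ExecutorService evictionThread =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bucket-cache-evictor");
            t.setDaemon(true);
            return t;
        });

    // Async per-file eviction: iterate all entries and drop the ones
    // belonging to this file, trading eviction speed for per-entry heap.
    static Future<?> evictBlocksByHfileName(String hfileName) {
        return evictionThread.submit(() ->
            backingMap.keySet().removeIf(k -> k.startsWith(hfileName + "_")));
    }

    public static void main(String[] args) throws Exception {
        backingMap.put("hfileA_0", new BucketEntry(0, 100));
        backingMap.put("hfileA_100", new SharedMemoryBucketEntry(100, 100));
        backingMap.put("hfileB_0", new BucketEntry(0, 100));
        evictBlocksByHfileName("hfileA").get(); // wait for the async evict
        System.out.println("remaining: " + backingMap.keySet()); // only hfileB_0
    }
}
```

Because the eviction is async, callers get a Future they can wait on if they need the removal to complete, while the common path (close/compaction) can fire and forget.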



> Reduce the heap overhead for BucketCache
> ----------------------------------------
>
>                 Key: HBASE-17819
>                 URL: https://issues.apache.org/jira/browse/HBASE-17819
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BucketCache
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17819_V1.patch, HBASE-17819_V2.patch, HBASE-17819_V3.patch
>
>
> We keep a bucket entry map in BucketCache.  Below is the heapSize math for the key and value in this map.
> BlockCacheKey
> ---------------
> String hfileName  -  Ref  - 4
> long offset  - 8
> BlockType blockType  - Ref  - 4
> boolean isPrimaryReplicaBlock  - 1
> Total  =  12 (Object) + 17 = 29
> BucketEntry
> ------------
> int offsetBase  -  4
> int length  - 4
> byte offset1  -  1
> byte deserialiserIndex  -  1
> long accessCounter  -  8
> BlockPriority priority  - Ref  - 4
> volatile boolean markedForEvict  -  1
> AtomicInteger refCount  -  16 + 4
> long cachedTime  -  8
> Total = 12 (Object) + 51 = 63
> ConcurrentHashMap Map.Entry  -  40
> blocksByHFile ConcurrentSkipListSet Entry  -  40
> Total = 29 + 63 + 80 = 172
> For 10 million blocks we will end up with about 1.6GB of heap.
> This jira aims to reduce this as much as possible.
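The per-entry accounting above can be re-checked with simple arithmetic (assuming, as the description does, 12-byte object headers and 4-byte compressed references; the class and method names below are illustrative, not from the patch). With 8-byte object alignment applied, the same fields also reproduce the "Now - 32 + 64 + 40 + 40 = 176" figure from the edited comment.

```java
public class HeapMath {
    // JVM objects are padded to 8-byte boundaries.
    static int align8(int n) {
        return (n + 7) & ~7;
    }

    // BlockCacheKey: 12 (header) + ref(4) + long(8) + ref(4) + boolean(1) = 29
    static int blockCacheKey() {
        return 12 + 4 + 8 + 4 + 1;
    }

    // BucketEntry: 12 (header) + int(4) + int(4) + byte(1) + byte(1)
    //   + long(8) + ref(4) + boolean(1) + AtomicInteger(16 + 4) + long(8) = 63
    static int bucketEntry() {
        return 12 + 4 + 4 + 1 + 1 + 8 + 4 + 1 + (16 + 4) + 8;
    }

    // Unaligned per-block total, as in the description: 29 + 63 + 40 + 40 = 172
    static int perBlock() {
        return blockCacheKey() + bucketEntry() + 40 + 40;
    }

    // Aligned per-block total, matching the comment: 32 + 64 + 40 + 40 = 176
    static int perBlockAligned() {
        return align8(blockCacheKey()) + align8(bucketEntry()) + 40 + 40;
    }

    public static void main(String[] args) {
        System.out.println("per block (unaligned): " + perBlock());        // 172
        System.out.println("per block (aligned):   " + perBlockAligned()); // 176
        // 10 million blocks at 172 bytes each is roughly 1.6 GiB of heap.
        System.out.println("10M blocks: "
            + (10_000_000L * perBlock() / (1 << 20)) + " MB"); // 1640 MB
    }
}
```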



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
