hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17819) Reduce the heap overhead for BucketCache
Date Sun, 16 Jul 2017 07:53:02 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16088838#comment-16088838
] 

Anoop Sam John commented on HBASE-17819:
----------------------------------------

These are the things I am trying out (rough sketches of each change follow the list):
1. We have 2 Enum refs, one in the key and one in BucketEntry. Change those to byte fields and
store just the ordinal; the Enums have only a few constants, so a byte is enough.
Result : Saving 6 bytes per entry
2. Split BucketEntry into 2 classes, one for file-mode IOEngines and one for RAM-backed
IOEngines. Only the RAM-backed case needs the ref-count mechanism; in file mode we can remove
that state and markedForEvict.
Result : Saving 21 bytes per entry for file mode
3. Change the refCount from AtomicInteger to a volatile int. The AtomicInteger object and its
ref in BucketEntry take 20 bytes, whereas a plain int needs only 4. For the atomic
increment/decrement we mimic what AtomicInteger does internally (Unsafe-based CAS).
Result : Saving 16 bytes per entry for RAM-backed IOEngine
4. Remove the CSLM used for tracking blocks per HFile. To remove blocks when an HFile is
closed, we will then have to iterate over all bucket entries, check each entry's HFile, and
remove the matches. This is what we already do in the LRU cache. Since this operation does not
happen on a hot path, is that ok? We do this when the CompactedHFilesDischarger chore runs
(at a 2-minute interval) and remove all compacted-away files.
Result : Saving 40 bytes per entry
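
A minimal sketch of change 1, with hypothetical class and field names (this is not the actual patch), assuming a small BlockType-like enum:

{code:java}
// Sketch of item 1: keep a 1-byte ordinal instead of a 4-byte enum reference.
// BlockTypeSketch is a made-up stand-in for the real BlockType enum.
public class CacheKeySketch {
  public enum BlockTypeSketch { DATA, META, INDEX, BLOOM }

  // Before: BlockTypeSketch blockType;   // object reference, 4 bytes with compressed oops
  // After:  the ordinal stored inline, 1 byte.
  private final byte blockTypeOrdinal;

  public CacheKeySketch(BlockTypeSketch type) {
    this.blockTypeOrdinal = (byte) type.ordinal();   // only a few constants, fits in a byte
  }

  public BlockTypeSketch getBlockType() {
    return BlockTypeSketch.values()[blockTypeOrdinal];  // re-materialize the constant on demand
  }
}
{code}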
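
For change 2, a rough sketch of the class split, again with made-up names; the ref-count state lives only on the entry type used for RAM-backed IOEngines:

{code:java}
// Sketch of item 2: two entry classes so that file-mode entries carry no ref-count state.
public class BucketEntrySketch {
  protected int offsetBase;     // fields common to both engine types
  protected int length;
  protected long accessCounter;

  /** Entry used with file-mode IOEngines: no refCount, no markedForEvict. */
  public static class FileModeEntry extends BucketEntrySketch {
  }

  /** Entry used with RAM-backed IOEngines: blocks are served by reference,
      so readers must be tracked and eviction may have to be deferred. */
  public static class RamBackedEntry extends BucketEntrySketch {
    volatile int refCount;
    volatile boolean markedForEvict;
  }
}
{code}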
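
For change 3, the idea is a plain volatile int with a CAS on it. The sketch below uses the public AtomicIntegerFieldUpdater API rather than Unsafe, which gives the same atomic semantics; names are hypothetical:

{code:java}
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Sketch of item 3: a bare volatile int (4 bytes inline) instead of an
// AtomicInteger object (~16 bytes) plus its reference (4 bytes).
public class RefCountedEntrySketch {
  private volatile int refCount;

  private static final AtomicIntegerFieldUpdater<RefCountedEntrySketch> REF_COUNT =
      AtomicIntegerFieldUpdater.newUpdater(RefCountedEntrySketch.class, "refCount");

  public int incrementRefCount() {
    return REF_COUNT.incrementAndGet(this);   // atomic CAS loop, like AtomicInteger does internally
  }

  public int decrementRefCount() {
    return REF_COUNT.decrementAndGet(this);
  }

  public int getRefCount() {
    return refCount;                          // volatile read
  }
}
{code}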
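
And for change 4, a sketch of evicting a closed HFile's blocks by scanning the backing map instead of consulting a per-file ConcurrentSkipListSet; types and method names are simplified stand-ins for the real BucketCache internals:

{code:java}
import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of item 4: evict all blocks of a given HFile with a full scan of the
// backing map. The scan is acceptable because it only runs when a file is
// closed, e.g. from the periodic CompactedHFilesDischarger chore.
public class EvictByFileSketch {
  /** Minimal stand-in for BlockCacheKey (the real key also overrides equals/hashCode). */
  public static final class KeySketch {
    final String hfileName;
    final long offset;
    KeySketch(String hfileName, long offset) { this.hfileName = hfileName; this.offset = offset; }
  }

  private final ConcurrentHashMap<KeySketch, Object> backingMap = new ConcurrentHashMap<>();

  /** Evict every cached block that belongs to the given (e.g. compacted-away) HFile. */
  public int evictBlocksByHfileName(String hfileName) {
    int evicted = 0;
    for (Iterator<KeySketch> it = backingMap.keySet().iterator(); it.hasNext();) {
      if (it.next().hfileName.equals(hfileName)) {
        it.remove();                          // removes the mapping from the backing map
        evicted++;
      }
    }
    return evicted;
  }
}
{code}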

> Reduce the heap overhead for BucketCache
> ----------------------------------------
>
>                 Key: HBASE-17819
>                 URL: https://issues.apache.org/jira/browse/HBASE-17819
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BucketCache
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>
> We keep the bucket entry map in BucketCache.  Below is the math for the heap size of the
key and value in this map.
> BlockCacheKey
> ---------------
> String hfileName  -  Ref  - 4
> long offset  - 8
> BlockType blockType  - Ref  - 4
> boolean isPrimaryReplicaBlock  - 1
> Total  =  12 (Object) + 17 = 29
> BucketEntry
> ------------
> int offsetBase  -  4
> int length  - 4
> byte offset1  -  1
> byte deserialiserIndex  -  1
> long accessCounter  -  8
> BlockPriority priority  - Ref  - 4
> volatile boolean markedForEvict  -  1
> AtomicInteger refCount  -  16 + 4
> long cachedTime  -  8
> Total = 12 (Object) + 51 = 63
> ConcurrentHashMap Map.Entry  -  40
> blocksByHFile ConcurrentSkipListSet Entry  -  40
> Total = 29 + 63 + 80 = 172
> For 10 million blocks we will end up having 1.6GB of heap size.  
> This jira aims to reduce this as much as possible.
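
A quick check of the arithmetic above: 172 bytes per block × 10,000,000 blocks = 1,720,000,000 bytes, i.e. roughly 1.6 GiB of heap spent just on these per-block structures.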



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
