hbase-issues mailing list archives

From "Anastasia Braginsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16438) Create a cell type so that chunk id is embedded in it
Date Sun, 26 Mar 2017 13:27:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942275#comment-15942275 ]

Anastasia Braginsky commented on HBASE-16438:

Hi [~anoop.hbase], [~ram_krish], and [~carp84],

As I understand it, in HBASE-16195 you avoid adding a chunk to MSLAB's list of chunks
in order to make garbage collection (GC) faster. So when there are no more references to
this chunk from a SkipList/CellArrayMap, the chunk's memory can be freed by GC.

So now openScannerCount in MSLAB serves only to determine when chunks can be returned
to the pool (if they were allocated from the pool).
And why do we care about the chunks allocated from the pool but not about those handled
by GC (allocated by the JVM)? The problem is as follows:
When a Segment is removed (let's say due to a flush to disk), it is no longer referenced from
the MemStore and the Segment is closed, together with its MSLAB.
However, there might still be ongoing scans accessing the chunks of this segment. Those
chunks cannot be de-allocated by GC because the scans still hold references to them. But if we
return the chunks to the pool, they can be reused and the memory corrupted under the scan's hands.
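
To make the hazard above concrete, here is a minimal Java sketch of the rule it implies: pooled chunks go back to the pool only after the MSLAB is closed and the last scanner has finished. All class, field, and method names (PooledChunkLifecycleSketch, Mslab, POOL, incScannerCount, decScannerCount) are hypothetical stand-ins for illustration, not the actual HBase implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class PooledChunkLifecycleSketch {
    // Hypothetical stand-in for a chunk pool: chunks placed here may be reused.
    static final Queue<byte[]> POOL = new ConcurrentLinkedQueue<>();

    // Hypothetical stand-in for an MSLAB that tracks its pooled chunks.
    static class Mslab {
        private final List<byte[]> pooledChunks = new ArrayList<>();
        private final AtomicInteger openScannerCount = new AtomicInteger();
        private final AtomicBoolean closed = new AtomicBoolean();
        private final AtomicBoolean reclaimed = new AtomicBoolean();

        // Assumed to be called only before close(), from a single thread.
        void addPooledChunk(byte[] chunk) {
            pooledChunks.add(chunk);
        }

        // A scan starts reading this MSLAB's chunks.
        void incScannerCount() {
            openScannerCount.incrementAndGet();
        }

        // A scan finishes; the last one out recycles if already closed.
        void decScannerCount() {
            if (openScannerCount.decrementAndGet() == 0 && closed.get()) {
                recycle();
            }
        }

        // The owning segment was removed (for example after a flush).
        void close() {
            closed.set(true);
            if (openScannerCount.get() == 0) {
                recycle();
            }
        }

        // Return chunks to the pool at most once.
        private void recycle() {
            if (reclaimed.compareAndSet(false, true)) {
                POOL.addAll(pooledChunks);
                pooledChunks.clear();
            }
        }
    }

    public static void main(String[] args) {
        Mslab mslab = new Mslab();
        mslab.addPooledChunk(new byte[16]);
        mslab.incScannerCount();   // a scan is in progress
        mslab.close();             // segment flushed and closed
        System.out.println("in pool while scanning: " + POOL.size());
        mslab.decScannerCount();   // scan ends, chunk may now be reused
        System.out.println("in pool after scan: " + POOL.size());
    }
}
```

Chunks allocated directly by the JVM need no such counter in this sketch: once the segment and all scans drop their references, GC reclaims them.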

Now that we are introducing the ChunkCreator (which keeps a chunkID-to-chunk map), you are afraid
that we keep references to chunks for too long and delay the GC. I am saying all this so you can
check whether I understand it correctly. If I am wrong, please correct me. If I am right,
then I have a suggestion for the following *simple* solution.

As Ram has suggested, keep a boolean in the chunk saying whether it is from the pool or not and...
Just remove the chunkID-to-chunk mapping when the segment is closed (for chunks that are not in

The scans (if they are still working) don't need the translation from chunk ID to chunk. This
translation is needed only for flattening/compaction when the segment is still alive. How
about that? :)
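
A rough Java sketch of this suggestion follows. The sentence describing which chunks are affected is cut off in this archive, so the code assumes it means chunks that did not come from the pool; that reading, and every name here (ChunkIdMappingSketch, ChunkRegistry, fromPool, onSegmentClose), is an assumption for illustration and not the real ChunkCreator API.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ChunkIdMappingSketch {
    // Hypothetical chunk carrying its id and the suggested boolean flag.
    static final class Chunk {
        final int id;
        final boolean fromPool;
        final byte[] data;

        Chunk(int id, boolean fromPool, int size) {
            this.id = id;
            this.fromPool = fromPool;
            this.data = new byte[size];
        }
    }

    // Hypothetical stand-in for the component keeping the chunkID-to-chunk map.
    static final class ChunkRegistry {
        private final Map<Integer, Chunk> chunkIdMap = new ConcurrentHashMap<>();
        private final AtomicInteger nextId = new AtomicInteger(1);

        Chunk create(boolean fromPool, int size) {
            Chunk chunk = new Chunk(nextId.getAndIncrement(), fromPool, size);
            chunkIdMap.put(chunk.id, chunk);
            return chunk;
        }

        // Needed for flattening/compaction while the segment is alive.
        Chunk getChunk(int id) {
            return chunkIdMap.get(id);
        }

        // On segment close, drop the mapping for non-pool chunks (assumed
        // reading) so the map no longer keeps them reachable for GC.
        void onSegmentClose(Iterable<Chunk> segmentChunks) {
            for (Chunk chunk : segmentChunks) {
                if (!chunk.fromPool) {
                    chunkIdMap.remove(chunk.id);
                }
            }
        }
    }

    public static void main(String[] args) {
        ChunkRegistry registry = new ChunkRegistry();
        Chunk jvmChunk = registry.create(false, 32);
        Chunk pooledChunk = registry.create(true, 32);
        registry.onSegmentClose(Arrays.asList(jvmChunk, pooledChunk));
        System.out.println("JVM chunk still mapped: " + (registry.getChunk(jvmChunk.id) != null));
        System.out.println("pooled chunk still mapped: " + (registry.getChunk(pooledChunk.id) != null));
    }
}
```

An ongoing scan in this sketch would hold direct references to the chunks it reads, so removing the id mapping does not affect it.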

> Create a cell type so that chunk id is embedded in it
> -----------------------------------------------------
>                 Key: HBASE-16438
>                 URL: https://issues.apache.org/jira/browse/HBASE-16438
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-16438_1.patch, HBASE-16438_3_ChunkCreatorwrappingChunkPool.patch,
HBASE-16438_4_ChunkCreatorwrappingChunkPool.patch, HBASE-16438.patch, MemstoreChunkCell_memstoreChunkCreator_oldversion.patch,
> For CellChunkMap we may need a cell type in which the id of the chunk it was created from
is embedded, so that when doing flattening we can use the chunk id as metadata. More details
will follow once the initial tasks are completed.
> Why we need to embed the chunkid in the Cell is described by [~anastas] in this remark
over in parent issue https://issues.apache.org/jira/browse/HBASE-14921?focusedCommentId=15244119&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15244119

This message was sent by Atlassian JIRA