hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16438) Create a cell type so that chunk id is embedded in it
Date Sun, 26 Mar 2017 15:58:42 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15942323#comment-15942323 ]

Anoop Sam John commented on HBASE-16438:

The problem in HBASE-16195 is this.
Say we have MSLAB in place and it is not pooled at all, and the data is heavily duplicated:
one cell is added, and the same cell may come in again several more times. When we add a cell
to the CSLM, the existing identical one gets removed, right? You can see the logic around
sizing and so on.
Now say a chunk is 200 bytes in size. cell1 is added, which is 100 bytes, and then cell2, again
100 bytes. That chunk is now full and we get another chunk of 200 bytes. Again cell1 and cell2
come in. The CSLM removes the previously added cell1 and cell2, which means no cell holds a
reference to chunk1 any more. (Forget about concurrent scanners; in this case assume there is
no concurrent scan at all.) Only chunk2 is active now. Ideally GC can reclaim chunk1. But if we
keep a reference to the chunks anywhere other than from the cells, the chunk cannot be
reclaimed. This was the issue. Hope it is clear now.
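
The 200-byte chunk scenario above can be traced with a small sketch. This is not HBase code: Chunk, Cell, and ChunkRefSketch are hypothetical stand-ins, and the only real API used is java.util.concurrent.ConcurrentSkipListMap. It only shows which chunks are still reachable through cells after the duplicates arrive; any extra strong reference to chunk 1 held elsewhere (for example in an id-to-chunk map) would keep it alive regardless.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;

class ChunkRefSketch {
    static final int CHUNK_SIZE = 200; // chunk size from the example above

    /** Hypothetical stand-in for an MSLAB chunk. */
    static final class Chunk {
        final int id;
        int used = 0;
        Chunk(int id) { this.id = id; }
    }

    /** Hypothetical stand-in for a cell whose bytes live in a chunk. */
    static final class Cell {
        final String key;
        final Chunk chunk; // the only reference that keeps the chunk reachable
        Cell(String key, Chunk chunk) { this.key = key; this.chunk = chunk; }
    }

    private final ConcurrentSkipListMap<String, Cell> cslm = new ConcurrentSkipListMap<>();
    private Chunk current;
    private int nextId = 1;

    /** Copies a cell of the given size into the current chunk, rolling to a new chunk when full. */
    void add(String key, int size) {
        if (current == null || current.used + size > CHUNK_SIZE) {
            current = new Chunk(nextId++);
        }
        current.used += size;
        // put() replaces any existing entry with the same key, dropping the old Cell.
        cslm.put(key, new Cell(key, current));
    }

    /** Number of chunks handed out so far. */
    int chunksAllocated() { return nextId - 1; }

    /** Ids of the chunks that at least one cell in the map still points to. */
    Set<Integer> referencedChunkIds() {
        Set<Integer> ids = new HashSet<>();
        for (Cell c : cslm.values()) {
            ids.add(c.chunk.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        ChunkRefSketch s = new ChunkRefSketch();
        s.add("cell1", 100);
        s.add("cell2", 100); // chunk 1 is now full
        s.add("cell1", 100); // duplicate: goes to chunk 2, replaces the old cell1
        s.add("cell2", 100); // duplicate: goes to chunk 2, replaces the old cell2
        System.out.println("chunks allocated: " + s.chunksAllocated());               // 2
        System.out.println("chunks still referenced by cells: " + s.referencedChunkIds()); // [2]
    }
}
```

After the four adds, two chunks have been allocated but only chunk 2 is referenced from the map, so chunk 1 is eligible for collection as long as nothing else points to it.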
As long as we have even one active cell within a chunk (the cell object itself is gone, as it
has been converted to ChunkMap), we need its mapping. A scan can refer to this cell at any time.
How do you plan to support the ChunkMap feature? I mean, how do we pass the info on whether the
CSLM has to be converted to CellArrayMap or CellChunkMap? When CellChunkMap is not in use, maybe
we don't need to keep this id vs chunk map at all. How about we enable this feature iff an MSLAB
pool is in place? Just asking.
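
One way to picture the suggestion above is a registry that records the id-to-chunk mapping only when a flag is set. This is a rough sketch under assumptions, not the HBASE-16438 patch: ChunkRegistrySketch, trackChunks, allocate, lookup, and release are all hypothetical names, and the flag stands in for "MSLAB pool is in place" (or "CellChunkMap is in use").

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ChunkRegistrySketch {
    /** Hypothetical stand-in for an MSLAB chunk with an id. */
    static final class Chunk {
        final int id;
        final byte[] data;
        Chunk(int id, int size) { this.id = id; this.data = new byte[size]; }
    }

    // Assumption: true only when the pool / CellChunkMap feature is enabled.
    private final boolean trackChunks;
    private final Map<Integer, Chunk> idToChunk = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger(1);

    ChunkRegistrySketch(boolean trackChunks) { this.trackChunks = trackChunks; }

    /** Creates a chunk; the strong id-to-chunk reference is kept only when tracking is on. */
    Chunk allocate(int size) {
        Chunk c = new Chunk(nextId.getAndIncrement(), size);
        if (trackChunks) {
            idToChunk.put(c.id, c);
        }
        return c;
    }

    /** Returns the chunk for an id, or null when it is not tracked. */
    Chunk lookup(int id) { return idToChunk.get(id); }

    /** Drops the mapping so the chunk is again reachable only through its cells. */
    void release(int id) { idToChunk.remove(id); }

    int trackedCount() { return idToChunk.size(); }
}
```

With tracking off, the registry never holds a chunk, so the reclamation behaviour described for HBASE-16195 is unchanged. With tracking on, the mapping has to be released explicitly once no active cell in the chunk remains.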

> Create a cell type so that chunk id is embedded in it
> -----------------------------------------------------
>                 Key: HBASE-16438
>                 URL: https://issues.apache.org/jira/browse/HBASE-16438
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-16438_1.patch, HBASE-16438_3_ChunkCreatorwrappingChunkPool.patch,
HBASE-16438_4_ChunkCreatorwrappingChunkPool.patch, HBASE-16438.patch, MemstoreChunkCell_memstoreChunkCreator_oldversion.patch,
> For CellChunkMap we may need a cell type in which the id of the chunk it was created from
> is embedded, so that when doing flattening we can use the chunk id as metadata. More details
> will follow once the initial tasks are completed.
> Why we need to embed the chunkid in the Cell is described by [~anastas] in this remark
> over in parent issue https://issues.apache.org/jira/browse/HBASE-14921?focusedCommentId=15244119&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15244119

This message was sent by Atlassian JIRA
