hbase-issues mailing list archives

From "Anastasia Braginsky (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14921) Memory optimizations
Date Sun, 28 Feb 2016 23:03:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171229#comment-15171229 ]

Anastasia Braginsky commented on HBASE-14921:

I apologize for not explaining it well.
In an attempt to clarify, I wrote the attached document. It is not long and it has pictures :)
Maybe I am missing something, so please show me where my understanding is wrong.
I am answering briefly here, but please, please, please also take a look at the attached document.

bq. What will the serialization/format-transform look like (if any)?

I think there is no format-transform between CellBlocksSegment and HFile (if I understand
you correctly).
Flushing the Snapshot to disk is done exactly as before: data is written from the scanner
to the sink (HFile via StoreFile).
But please look at “How CellBlocksSegment is transfered to HFile?” in the document.

bq. After that the Cell object is created and the reference to this Cell is inserted into
the skip-list to accelerate the search.
bq. Yes. This is a copy. Would be good if we did not have to do this.

Note that CellBlocksSegments are created as a result of the compaction process. This
is how we compact: we take a mix of “obsolete” and “updated” cells and copy only the
“updated” cells to another place. The memory holding the mix can then be released.
Please look at “Why copies are needed in compacting process?” in the document.
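
For illustration only, the copy step described above can be sketched as follows. This is not code from the patch or from the attached document: `SketchCell`, `CompactionSketch`, and `compact` are hypothetical names, and the sketch assumes the input is already sorted by key and then by descending version, so that the first cell seen for each key is the "updated" one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified cell: a row key, a version, and a value. */
final class SketchCell {
    final String key;
    final long version;
    final String value;

    SketchCell(String key, long version, String value) {
        this.key = key;
        this.version = version;
        this.value = value;
    }
}

final class CompactionSketch {
    /**
     * Copies only the newest version of each key into a fresh list.
     * Assumes 'mixed' is sorted by key ascending, then version descending.
     * Once the result is built, the caller can drop its reference to
     * 'mixed' so the memory holding the obsolete cells can be reclaimed.
     */
    static List<SketchCell> compact(List<SketchCell> mixed) {
        List<SketchCell> live = new ArrayList<>();
        String lastKey = null;
        for (SketchCell c : mixed) {
            if (!c.key.equals(lastKey)) {
                // This is the copy: the "updated" cell moves to a new place.
                live.add(new SketchCell(c.key, c.version, c.value));
                lastKey = c.key;
            }
            // Older versions of the same key are skipped, not copied.
        }
        return live;
    }

    public static void main(String[] args) {
        List<SketchCell> mixed = Arrays.asList(
                new SketchCell("row1", 3, "c"),
                new SketchCell("row1", 1, "a"),
                new SketchCell("row2", 2, "b"));
        for (SketchCell c : compact(mixed)) {
            System.out.println(c.key + "@" + c.version + "=" + c.value);
        }
    }
}
```

The point of the sketch is only that dropping obsolete cells requires moving the surviving ones; the real layout of a CellBlocksSegment is described in the attached document.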

bq. You've seen how we store blocks to hfiles with index blocks and blooms?

Yes. Maybe I am missing something, but it looks to me that this variant is not the best. With
a single-level index you lose logarithmic access, and with a multi-level index
you get logarithmic access but pay for it in memory overhead. This is also explained in the document.
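
One possible reading of the single-level case, sketched under assumptions: a binary search over the first key of each block is logarithmic in the number of blocks, but the search inside the chosen block is then a linear scan. The names below (`SingleLevelIndexSketch`, `findBlock`, `scanBlock`) are hypothetical and this is not the HFile implementation; keys are plain strings for simplicity.

```java
import java.util.Arrays;

final class SingleLevelIndexSketch {
    /**
     * Returns the index of the last block whose first key is <= key,
     * or -1 if key sorts before every block. 'firstKeys' must be sorted.
     */
    static int findBlock(String[] firstKeys, String key) {
        int pos = Arrays.binarySearch(firstKeys, key);
        // Not found: binarySearch returns -(insertionPoint) - 1,
        // and the candidate block is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Linear scan inside one block; returns the position of key or -1. */
    static int scanBlock(String[] blockKeys, String key) {
        for (int i = 0; i < blockKeys.length; i++) {
            if (blockKeys[i].equals(key)) {
                return i;
            }
        }
        return -1;
    }

    /** Logarithmic step over the index, then a linear step in the block. */
    static boolean contains(String[][] blocks, String[] firstKeys, String key) {
        int block = findBlock(firstKeys, key);
        return block >= 0 && scanBlock(blocks[block], key) >= 0;
    }

    public static void main(String[] args) {
        String[][] blocks = {{"a", "c"}, {"f", "h"}, {"m", "p"}};
        String[] firstKeys = {"a", "f", "m"};
        System.out.println(contains(blocks, firstKeys, "h"));
        System.out.println(contains(blocks, firstKeys, "g"));
    }
}
```

Adding further index levels would shorten the linear step, at the price of keeping more index entries in memory, which is the overhead mentioned above.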

> Memory optimizations
> --------------------
>                 Key: HBASE-14921
>                 URL: https://issues.apache.org/jira/browse/HBASE-14921
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>         Attachments: CellBlocksSegmentInMemStore.pdf
> Memory optimizations including compressed format representation and offheap allocations

This message was sent by Atlassian JIRA
