Date: Sun, 28 Feb 2016 23:03:18 +0000 (UTC)
From: "Anastasia Braginsky (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-14921) Memory optimizations

    [ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171229#comment-15171229 ]

Anastasia Braginsky commented on HBASE-14921:
---------------------------------------------

I apologize for not explaining it well.
In an attempt to clarify, I wrote the attached paper. It is not long, and it has pictures :)
Maybe I am missing something, so please show me where my understanding is wrong.
I am answering briefly here, but please, please, please also take a look at the attached document.

bq.
What will the serialization/format-transform look like (if any)?

I think there is no format transform between CellBlocksSegment and HFile (if I understand you correctly).
Flushing a snapshot to disk is done exactly as before: data is written from a scanner to a sink (an HFile, via StoreFile).
But please look at “How CellBlocksSegment is transfered to HFile?” in the document.

bq. After that the Cell object is created and the reference to this Cell is inserted into the skip-list to accelerate the search.
bq. Yes. This is a copy. Would be good if we did not have to do this.

Note that CellBlocksSegments are created as a result of the compaction process. This is how we compact: we take a mix of “obsolete” cells and “updated” cells and copy only the “updated” cells to another place. Then the memory holding the mix can be released. Please look at “Why copies are needed in compacting process?” in the document.

bq. You've seen how we store blocks to hfiles with index blocks and blooms?

Yes. Maybe I am missing something, but it looks to me that this variant is not the best. With a single-level index you lose logarithmic access, and with a multi-level index you get logarithmic access but pay in memory overhead. This is also explained in the document.

> Memory optimizations
> --------------------
>
>                 Key: HBASE-14921
>                 URL: https://issues.apache.org/jira/browse/HBASE-14921
>             Project: HBase
>          Issue Type: Sub-task
>    Affects Versions: 2.0.0
>            Reporter: Eshcar Hillel
>         Attachments: CellBlocksSegmentInMemStore.pdf
>
>
> Memory optimizations including compressed format representation and offheap allocations

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)