Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Wed, 2 Mar 2016 05:36:18 +0000 (UTC)
From: "stack (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12945739.1456815250000.191116.1456896978265@Atlassian.JIRA>
In-Reply-To: <JIRA.12945739.1456815250000@Atlassian.JIRA>
References: <JIRA.12945739.1456815250000@Atlassian.JIRA>
 <JIRA.12945739.1456815250911@arcas>
Subject: [jira] [Commented] (HBASE-15366) Add doc, trace-level logging, and
 test around hfileblock
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-15366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175070#comment-15175070 ] 

stack commented on HBASE-15366:
-------------------------------

bq. So the bucket cache is also caching these extra bytes, you mean? This is interesting. Worth changing it I believe.

Yes. When we read from HDFS, we read the next block's header too. Generally this lets us read a block with one seek rather than two (to be better evaluated, but it seems to be the case in testing: a cold read of a block requires a seek to read the block header to see how to read the rest of the block... its length, how it's checksummed, etc. We should change this!). This block + next header + 13 bytes of EXTRA stuff is what we shove into the bucketcache (the EXTRA stuff is metadata needed for reconstituting the hfileblock from its bucketcache representation; we should fix this (smile)). Every hfileblock in blockcache is carrying an extra 50 bytes.
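
To make the arithmetic above concrete, here is a rough sketch of how a cached entry's size adds up: block, plus the next block's header, plus the 13 bytes of extra metadata. The class and method names are made up for illustration and are not HBase APIs, and the 33-byte header length used in main is an assumption rather than something stated in this thread.

```java
// Hypothetical helper, not part of HBase: illustrates the per-entry cache footprint.
public final class CachedBlockFootprint {
    // From the comment above: 13 bytes of extra metadata per cached entry.
    public static final int EXTRA_METADATA_BYTES = 13;

    private CachedBlockFootprint() {}

    /** Bytes stored in the cache for one block: block + next block's header + extra metadata. */
    public static long cachedEntryBytes(long blockBytes, int nextHeaderBytes) {
        if (blockBytes < 0 || nextHeaderBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        return blockBytes + nextHeaderBytes + EXTRA_METADATA_BYTES;
    }

    /** Bytes each cached entry carries beyond the block itself. */
    public static long overheadBytes(int nextHeaderBytes) {
        if (nextHeaderBytes < 0) {
            throw new IllegalArgumentException("header size must be non-negative");
        }
        return (long) nextHeaderBytes + EXTRA_METADATA_BYTES;
    }

    public static void main(String[] args) {
        int assumedHeaderBytes = 33;   // ASSUMPTION: header length, not given in this thread
        long blockBytes = 64L * 1024;  // ASSUMPTION: an illustrative 64 KiB block
        System.out.println("cached entry bytes: " + cachedEntryBytes(blockBytes, assumedHeaderBytes));
        System.out.println("overhead bytes:     " + overheadBytes(assumedHeaderBytes));
    }
}
```

The header length is a parameter so the same sum can be redone with whatever header size actually applies to the file format in use.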

The original "Why are there 50 bytes tagged on to the end of the hfileblock?" question came from a gentleman named Daniel Pol who is trying to go big w/ bucketcache.

Thanks for the review, [~ram_krish]. Let me get this doc in first before I start making changes. Thanks.


> Add doc, trace-level logging, and test around hfileblock
> --------------------------------------------------------
>
>                 Key: HBASE-15366
>                 URL: https://issues.apache.org/jira/browse/HBASE-15366
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BlockCache
>    Affects Versions: 2.0.0
>            Reporter: stack
>            Assignee: stack
>             Fix For: 2.0.0
>
>         Attachments: 15366.patch, 15366v2.patch, 15366v3.patch, 15366v4.patch
>
>
> What hfileblock is doing -- that it overreads when pulling in from hdfs to fetch the header of the next block to save on seeks; that it caches the block and overread and then adds an extra 13 bytes to the cached entry; that buckets in bucketcache have at least four hfileblocks in them and so on -- was totally baffling me. This patch docs the class, adds some trace-level logging so you can see if you are doing the right thing, and then adds a test of file-backed bucketcache that checks that persistence is working.

