hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15366) Add doc, trace-level logging, and test around hfileblock
Date Wed, 02 Mar 2016 05:36:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175070#comment-15175070 ]

stack commented on HBASE-15366:

bq. So the bucket cache is also caching these extra bytes, you mean? This is interesting.
Worth changing it I believe.

Yes. When we read from HDFS, we read the next block's header too. Generally this means
we can read a block with one seek rather than two (to be better evaluated, but it seems to be
the case in testing -- a cold read of a block requires a seek to read the block header to
see how to read the rest of the block: its length, how it's checksummed, etc. We should
change this!). This block + next header + 13 bytes of EXTRA stuff is what we shove into the bucketcache
(the EXTRA stuff is metadata needed to reconstitute the hfileblock from its bucketcache representation
-- we should fix this (smile)). Every hfileblock in blockcache is carrying an extra 50 bytes.
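
To make the arithmetic above concrete, here is a rough sketch of how a cached entry's size breaks down.
It is not HBase code: the class and method names are made up for illustration, and the 33-byte header
size is an assumption (the header size of a checksummed hfileblock, as far as I know). Only the 13 bytes
of EXTRA metadata comes from the comment above.

```java
public class CachedEntrySize {
    // From the comment above: 13 bytes of EXTRA metadata are appended to each cached entry.
    static final int EXTRA_METADATA_BYTES = 13;

    /** Bytes pulled from HDFS in one read: the block itself plus the next block's header. */
    static long overreadSize(long onDiskBlockSize, int nextHeaderSize) {
        return onDiskBlockSize + nextHeaderSize;
    }

    /** Bytes stored in the cache: the overread plus the EXTRA metadata. */
    static long cachedEntrySize(long onDiskBlockSize, int nextHeaderSize) {
        return overreadSize(onDiskBlockSize, nextHeaderSize) + EXTRA_METADATA_BYTES;
    }

    /** Per-block overhead carried in the cache beyond the block's own bytes. */
    static int overheadPerBlock(int nextHeaderSize) {
        return nextHeaderSize + EXTRA_METADATA_BYTES;
    }

    public static void main(String[] args) {
        int assumedHeaderSize = 33;      // ASSUMPTION: not stated in this thread
        long blockSize = 64L * 1024;     // hypothetical on-disk block size, header included

        System.out.println("overread from HDFS: " + overreadSize(blockSize, assumedHeaderSize));
        System.out.println("cached entry size:  " + cachedEntrySize(blockSize, assumedHeaderSize));
        System.out.println("overhead per block: " + overheadPerBlock(assumedHeaderSize));
    }
}
```

With the assumed 33-byte header, the per-block overhead works out to 46 bytes, in the neighborhood of
the roughly 50 bytes mentioned above. That matters when sizing buckets: an entry is always somewhat
larger than the configured block size.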

The original "Why are there 50 bytes tagged on to the end of the hfileblock?" question came
from a gentleman named Daniel Pol who is trying to go big w/ bucketcache.

Thanks for the review, [~ram_krish]. Let me get this doc in first before I start making changes.

> Add doc, trace-level logging, and test around hfileblock
> --------------------------------------------------------
>                 Key: HBASE-15366
>                 URL: https://issues.apache.org/jira/browse/HBASE-15366
>             Project: HBase
>          Issue Type: Sub-task
>          Components: BlockCache
>    Affects Versions: 2.0.0
>            Reporter: stack
>            Assignee: stack
>             Fix For: 2.0.0
>         Attachments: 15366.patch, 15366v2.patch, 15366v3.patch, 15366v4.patch
> What hfileblock is doing -- that it overreads when pulling in from hdfs to fetch the
> header of the next block to save on seeks; that it caches the block and overread and then
> adds an extra 13 bytes to the cached entry; that buckets in bucketcache have at least four
> hfileblocks in them and so on -- was totally baffling me. This patch docs the class, adds
> some trace-level logging so you can see if you are doing the right thing, and then adds a
> test of file-backed bucketcache that checks that persistence is working.

This message was sent by Atlassian JIRA
