hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17874) Limiting of read request response size based on block size may go wrong when blocks are read from onheap or off heap bucket cache
Date Wed, 03 May 2017 10:21:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994628#comment-15994628

Anoop Sam John commented on HBASE-17874:

UT seems very difficult Stack :-(..  We have to make it such that one row sits in one slot
and another in a diff one and we read both in a multi get req.  The bucket slots has to be
in different physical ByteBuffer (Eac of size 4 MB fixed) under the BC's ByteBufferArray.
 This seems too much playing with internal of BC and am not sure how to make it to a UT :-(

> Limiting of read request response size based on block size may go wrong when blocks are
read from onheap or off heap bucket cache
> ---------------------------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-17874
>                 URL: https://issues.apache.org/jira/browse/HBASE-17874
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Blocker
>             Fix For: 2.0.0
>         Attachments: HBASE-17874.patch
> HBASE-14978 added this size limiting so as to make sure the multi read requests do not
retain two many blocks. This works well when the blocks are obtained from any where other
than memory mode BucketCache. In case of on heap or off heap Bucket Cache, the entire cache
area is split into N ByteBuffers each of size 4 MB. When we hit a block in this cache, we
no longer do copy data into temp array. We use the same shared memory (BB).  Its  capacity
is 4 MB.
> The block size accounting logic is RSRpcServices is like below
> {code}
>  if (c instanceof ByteBufferCell) {
>           ByteBufferCell bbCell = (ByteBufferCell) c;
>           ByteBuffer bb = bbCell.getValueByteBuffer();
>           if (bb != lastBlock) {
>             context.incrementResponseBlockSize(bb.capacity());
>             lastBlock = bb;
>           }
>         } else {
>           // We're using the last block being the same as the current block as
>           // a proxy for pointing to a new block. This won't be exact.
>           // If there are multiple gets that bounce back and forth
>           // Then it's possible that this will over count the size of
>           // referenced blocks. However it's better to over count and
>           // use two rpcs than to OOME the regionserver.
>           byte[] valueArray = c.getValueArray();
>           if (valueArray != lastBlock) {
>             context.incrementResponseBlockSize(valueArray.length);
>             lastBlock = valueArray;
>           }
>         }
> 		{code}
> We take the BBCell's value buffer and takes its capacity. The cell is backed by the same
BB that backs the HFileBlock. When the HFileBlock is created from the BC, we do as below duplicating
and proper positioning and limiting the BB
> {code}
>  ByteBuffer bb = buffers[i].duplicate();
>       if (i == startBuffer) {
>         cnt = bufferSize - startBufferOffset;
>         if (cnt > len) cnt = len;
>         bb.limit(startBufferOffset + cnt).position(startBufferOffset);
> 		{code}
> Still this BB's capacity is 4 MB.
> This will make the size limit breach to happen too soon. What we expect is block size
defaults to 64 KB and so we here by allow cells from different blocks to appear in response.
We have a way to check whether we move from one block to next.
> {code}
> if (bb != lastBlock) {
> ...
>             lastBlock = bb;
> }
> {code}
> But already just by considering the 1st cell, we added 4 MB size!

This message was sent by Atlassian JIRA

View raw message