hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12358) Create ByteBuffer backed Cell
Date Thu, 04 Dec 2014 18:37:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14234450#comment-14234450 ]

ramkrishna.s.vasudevan commented on HBASE-12358:

Before we decide on BB or BR, we would like to highlight some points here so that we can
decide on the behaviour of the new Cell APIs.
Assume we will be working with BBs, so we would introduce getXXXBuffer() APIs and also hasArray()
in Cell itself directly.
If we try to extend the Cell or create a new Cell, then everywhere we need to do an instanceof
check or a type conversion, and that is why adding the new APIs to the Cell interface itself makes
sense. The plan is to use this getXXXBuffer API throughout the read path.
Now there are two ways to use it:
1) Use getXXXBuffer along with getXXXOffset and getXXXLength, the way we now use the getXXXArray
APIs with offset and length. Doing so would ensure that everywhere in the filters and
CPs one just has to replace getXXXArray with getXXXBuffer and continue to use getXXXOffset
and getXXXLength. We would wrap the byte[] with a BB in the case of KeyValue-type
cells so that getXXXBuffer along with offset and length holds true everywhere. Note that
here if hasArray is false (for KV case) then getXXXArray would also work.
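
To make approach (1) concrete, here is a minimal sketch in Java. It is not the HBase Cell API: RowView, BufferRow and RowComparison are hypothetical names, and only the row component is modelled (getRowBuffer/getRowOffset/getRowLength standing in for the getXXX family). It assumes the buffer returned is the whole backing buffer and that the offset is an absolute index into it.

```java
import java.nio.ByteBuffer;

// Hypothetical, trimmed-down cell view: only the row component is modelled.
interface RowView {
    ByteBuffer getRowBuffer(); // whole backing buffer, not a slice
    int getRowOffset();        // absolute index of the first row byte
    int getRowLength();        // number of row bytes
}

final class BufferRow implements RowView {
    private final ByteBuffer buf;
    private final int offset;
    private final int length;

    BufferRow(ByteBuffer buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    // KeyValue-like case: wrap the existing byte[] once; no bytes are copied.
    static BufferRow wrapArray(byte[] backing, int offset, int length) {
        return new BufferRow(ByteBuffer.wrap(backing), offset, length);
    }

    @Override public ByteBuffer getRowBuffer() { return buf; }
    @Override public int getRowOffset() { return offset; }
    @Override public int getRowLength() { return length; }
}

final class RowComparison {
    // Unsigned lexicographic comparison using absolute get(int), so neither
    // buffer's position changes and no temporary objects are created.
    static int compareRows(RowView a, RowView b) {
        ByteBuffer ab = a.getRowBuffer();
        ByteBuffer bb = b.getRowBuffer();
        int n = Math.min(a.getRowLength(), b.getRowLength());
        for (int i = 0; i < n; i++) {
            int x = ab.get(a.getRowOffset() + i) & 0xff;
            int y = bb.get(b.getRowOffset() + i) & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.getRowLength() - b.getRowLength();
    }
}
```

With this shape the same comparison code works for a wrapped byte[] and for a direct buffer, which is the uniformity approach (1) is after.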

2) The other way is to use only the getXXXBuffer() API and ensure that the BB is
always duplicated/sliced so that only the portion of the total BB that represents
the individual component of the Cell is returned. In this case there is no use for getXXXOffset (as it
is going to be 0), and getXXXLength() is anyway going to be the sliced BB's limit.

But with the 2nd approach we may end up creating a lot of small objects, even while doing comparisons.
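
A rough sketch of what approach (2) implies, using only java.nio.ByteBuffer. ComponentSlicer is a hypothetical helper, not something from the patch. Each call allocates two short-lived buffer objects (the duplicate and the slice), which is the small-object cost mentioned above.

```java
import java.nio.ByteBuffer;

final class ComponentSlicer {
    // Returns a view covering only [offset, offset + length) of the backing
    // buffer. The backing buffer's own position and limit are left untouched.
    static ByteBuffer slice(ByteBuffer backing, int offset, int length) {
        ByteBuffer dup = backing.duplicate(); // shares content, own position/limit
        dup.clear();                          // position = 0, limit = capacity
        dup.limit(offset + length);
        dup.position(offset);
        return dup.slice();                   // position = 0, limit = length
    }
}
```

The returned buffer starts at position 0 and its limit equals the component length, so a separate offset and length carry no extra information in this model.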

The next problem is what to do with the getXXXArray() APIs. If one sees hasArray()
as false (a DBB backed Cell) and uses the getXXXArray() API along with offset and length,
what should we do? Should we create a byte[] from the DBB and return it? In that case,
what should getXXXOffset() return for a getXXXBuffer() or getXXXArray()?
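
The offset mismatch can be illustrated with a small sketch. DirectBufferCopy is a hypothetical helper and copying is only one of the options being discussed here, not a decided behaviour. If the component is copied out of a direct buffer into a fresh byte[], the bytes start at index 0 of the copy, while in the buffer they start at a non-zero offset, so a single getXXXOffset() value cannot be correct for both.

```java
import java.nio.ByteBuffer;

final class DirectBufferCopy {
    // Copies [offset, offset + length) out of a (possibly direct) buffer into
    // a new array. In the returned array the component starts at index 0.
    static byte[] copyComponent(ByteBuffer source, int offset, int length) {
        byte[] out = new byte[length];
        ByteBuffer dup = source.duplicate(); // do not disturb the caller's position
        dup.clear();                         // position = 0, limit = capacity
        dup.position(offset);
        dup.get(out, 0, length);             // relative bulk get from 'offset'
        return out;
    }
}
```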

If we go with the 2nd approach then getXXXBuffer() should be clearly documented, saying that
it has to be used without offset and length, and to use getXXXOffset and getXXXLength only with

Now if a Cell is backed by an on-heap BB then we could definitely return getXXXArray() also,
but what to return from getXXXOffset would be determined by which approach is used for getXXXBuffer
(based on (1) and (2)).

We wanted to open up this topic now to get some feedback on what could be an option
here.  Since it is a user-facing interface we need to be careful with this.

I would suggest that whenever a Cell is BB backed (on-heap or off-heap), hasArray would always
be false in that Cell impl.
Everywhere we would use getXXXBuffer along with getXXXOffset and getXXXLength.  Even in the case
of KV we could wrap the byte[] with a BB so that we have uniformity throughout the read code and
we don't have too many if/else conditions.

Whenever hasArray is false, using the getXXXArray API would throw UnsupportedOperationException.
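
A minimal sketch of this suggested contract, again with hypothetical names (BufferOnlyRow is not an HBase class) and only the row component shown:

```java
import java.nio.ByteBuffer;

// Hypothetical BB-backed cell fragment following the suggestion above:
// hasArray() is always false and the array accessor refuses to answer.
final class BufferOnlyRow {
    private final ByteBuffer buf;
    private final int offset;
    private final int length;

    BufferOnlyRow(ByteBuffer buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    boolean hasArray() { return false; }

    byte[] getRowArray() {
        throw new UnsupportedOperationException(
            "hasArray() is false; use getRowBuffer() with offset and length");
    }

    ByteBuffer getRowBuffer() { return buf; }
    int getRowOffset() { return offset; }
    int getRowLength() { return length; }
}
```

Callers would check hasArray() first and fall back to the buffer accessors when it returns false.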

As said, if we want getXXXArray to be supported as per the existing way, then getXXXBuffer and
getXXXOffset, getXXXLength should be clearly documented.


> Create ByteBuffer backed Cell
> -----------------------------
>                 Key: HBASE-12358
>                 URL: https://issues.apache.org/jira/browse/HBASE-12358
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver, Scanners
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>         Attachments: HBASE-12358.patch, HBASE-12358_1.patch, HBASE-12358_2.patch
> As part of HBASE-12224 and HBASE-12282 we wanted a Cell that is backed by BB.  Changing
the core Cell impl would not be needed as it is used in server only.  So we will create a
BB backed Cell and use it in the Server side read path. This JIRA just creates an interface
that extends Cell and adds the needed API.
> The getTimeStamp and getTypebyte() can still refer to the original Cell API only.  The
getXXxOffset() and getXXXLength() can also refer to the original Cell only.

This message was sent by Atlassian JIRA