hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder
Date Tue, 02 Feb 2016 03:54:40 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127620#comment-15127620

Anoop Sam John commented on HBASE-15180:

bq.I see what you are saying. Rather than BAIS, instead a CIS, one that does Cells more natively.
That sounds good. As long as the CellIS is an IS, we can use Codec Interfaces.
bq.Rather than pass a boolean to the method to do direct or tags, could you return a different
Codec implementation? A server-side Codec and/or tags-capable (would be better if serialization
figured if tags were present rather than a meta boolean passed in by the server)? Would we
still need to do context if serverside codec and clientside codec?
Yes, let me see. I agree that passing the boolean throughout is ugly. As of now we don't
support passing tags from client to server or the reverse. The Codec has to serialize tags
when it is used for Replication, so we have another Codec (KVCodecWithTags). But both of these
Codecs use the same BaseDecoder and Cell creation paths. Let me see how we can solve this.

bq.We are pivoting on the underlying Stream being a BAIS. Will it always be a BAIS? Will it
ever be a DBB?
It can be a DBB later; once we start reading the request into (pooled) DBBs, yes.
That said, there is no hard requirement for the underlying IS to be a BAIS. That is the reason
why I did not add something like a new API in Codec that asks the Decoder to work on a byte[]
or similar. Keeping it IS based gives us the freedom to change the underlying data
structure. We can make a ByteBufferIS that is Cell readable, and create Cells directly
(without a copy) over the underlying DBB. (We have OffheapCell now.)
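
A minimal sketch of the DBB idea above, not the patch itself. ByteBufferCellReader and
nextCellView are hypothetical names, and the framing (a 4-byte big-endian length before each
cell) is an assumption taken from the issue description. Each returned view shares memory with
the underlying buffer, so no cell bytes are copied.

```java
import java.nio.ByteBuffer;

/** Hypothetical reader that hands out zero-copy views of length-prefixed cells. */
final class ByteBufferCellReader {
  private final ByteBuffer buf;

  ByteBufferCellReader(ByteBuffer source) {
    // duplicate() shares the content but keeps an independent position/limit.
    this.buf = source.duplicate();
  }

  boolean hasNext() {
    return buf.remaining() >= 4;
  }

  /** Returns a view over the next cell's bytes without copying them. */
  ByteBuffer nextCellView() {
    int len = buf.getInt(); // assumed 4-byte big-endian length prefix
    if (len < 0 || len > buf.remaining()) {
      throw new IllegalStateException("bad cell length: " + len);
    }
    ByteBuffer window = buf.duplicate();
    window.limit(window.position() + len);
    ByteBuffer cellView = window.slice(); // shares memory with the source buffer
    buf.position(buf.position() + len);   // advance past this cell
    return cellView;
  }
}
```

Because the views alias the pooled buffer, the buffer must not be recycled while any view is
still in use; that lifecycle question is outside this sketch.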

bq.Could the createCell save a copy in same way? Look see if a BAIS and if so, use its buffer
and offset creating the Cell?
You mean the createCell in CellUtil? No. Even if the IS is a BAIS, we cannot directly make
a Cell (without any copy). BAIS does not expose its backing byte[] buffer, so we would need an
indirect way of grabbing the byte[] from it.
That is why I made CellBAIS extend BAIS, with an extra API to create a Cell
directly from its underlying buffer (without any copy).
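
To illustrate the point, here is a rough sketch of how a BAIS subclass can reach the backing
array. It relies on the protected buf and pos fields that java.io.ByteArrayInputStream exposes
to subclasses. CellSlice, CellAwareByteArrayInputStream and readCellSlice are hypothetical
names, not the classes in the patch, and the 4-byte big-endian length prefix is an assumption
based on the issue description.

```java
import java.io.ByteArrayInputStream;

/** Hypothetical holder: a cell described as (array, offset, length), no copy made. */
final class CellSlice {
  final byte[] array;
  final int offset;
  final int length;

  CellSlice(byte[] array, int offset, int length) {
    this.array = array;
    this.offset = offset;
    this.length = length;
  }
}

/** Hypothetical stand-in for the CellBAIS idea. */
class CellAwareByteArrayInputStream extends ByteArrayInputStream {
  CellAwareByteArrayInputStream(byte[] data) {
    super(data);
  }

  /** Reads the length prefix, then returns a slice over the same backing array. */
  synchronized CellSlice readCellSlice() {
    if (available() < 4) {
      throw new IllegalStateException("truncated length prefix");
    }
    // buf and pos are protected fields of ByteArrayInputStream.
    int len = ((buf[pos] & 0xff) << 24) | ((buf[pos + 1] & 0xff) << 16)
        | ((buf[pos + 2] & 0xff) << 8) | (buf[pos + 3] & 0xff);
    pos += 4;
    if (len < 0 || len > available()) {
      throw new IllegalStateException("bad cell length: " + len);
    }
    CellSlice slice = new CellSlice(buf, pos, len); // no new byte[] per cell
    pos += len;
    return slice;
  }
}
```

Any other InputStream only offers read(), which is why a Decoder working on it has to copy
the bytes out before it can build a Cell.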

bq.While reading from WAL, same path of flow is executed and then the IS wont be CellInputStream
Why not and should it?
From that stream we cannot make a Cell directly without a copy. The underlying stream
is the one from DFS. That said, CellInputStream is a stream from which we can make a Cell
DIRECTLY WITHOUT COPY. Is the name confusing? From other streams we can also read Cells, but with

Yes, I agree; I also don't like new configs. Actually we can avoid this and any context. We
can have 2 separate Decoder creation paths, one in the client and one in the server. We can
even move it out of IPCUtil. Let me see.

> Reduce garbage created while reading Cells from Codec Decoder
> -------------------------------------------------------------
>                 Key: HBASE-15180
>                 URL: https://issues.apache.org/jira/browse/HBASE-15180
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>         Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
> In KeyValueDecoder#parseCell (Default Codec decoder) we use KeyValueUtil#iscreate to
> read cells from the InputStream. Here we 1st create a byte[] of length 4 and read the cell
> length and then an array of Cell's length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created on top of
> a ByteArrayInputStream on top of this. By default in write path, we have MSLAB usage ON. So
> while adding Cells to memstore, we will copy the Cell bytes to MSLAB memory chunks (default
> 2 MB size) and recreate Cells over that bytes.  So there is no issue if we create Cells over
> the RPC read byte[] directly here in Decoder.  No need for 2 byte[] creation and copy for
> every Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells directly from
> Same Codec path is used in client side also. There better we can avoid this direct Cell
> create and continue to do the copy to smaller byte[]s path.  Plan to introduce some thing
> like a CodecContext associated with every Codec instance which can say the server/client context.
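
For reference, the per-cell garbage described in the first paragraph of the issue (a byte[] of
length 4 for the cell length, then a cell-sized byte[] for the copy) looks roughly like the
sketch below. This is not the KeyValueUtil#iscreate source; CopyingCellReader and readCellBytes
are hypothetical names, and big-endian length encoding is an assumption.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical illustration of the copying decode path: two arrays per cell. */
final class CopyingCellReader {
  /** Fills dst completely from the stream or fails with EOFException. */
  private static void readFully(InputStream in, byte[] dst) throws IOException {
    int off = 0;
    while (off < dst.length) {
      int n = in.read(dst, off, dst.length - off);
      if (n < 0) {
        throw new EOFException("stream ended mid-cell");
      }
      off += n;
    }
  }

  static byte[] readCellBytes(InputStream in) {
    try {
      byte[] lenBytes = new byte[4]; // allocation 1: the length prefix
      readFully(in, lenBytes);
      int len = ((lenBytes[0] & 0xff) << 24) | ((lenBytes[1] & 0xff) << 16)
          | ((lenBytes[2] & 0xff) << 8) | (lenBytes[3] & 0xff);
      if (len < 0) {
        throw new IllegalStateException("bad cell length: " + len);
      }
      byte[] cellBytes = new byte[len]; // allocation 2: a copy of the cell bytes
      readFully(in, cellBytes);
      return cellBytes;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

On the server, where the request already sits in one byte[], both allocations can be skipped
by pointing the Cell at that array; the client side can keep this copying path.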

This message was sent by Atlassian JIRA