hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder
Date Sat, 30 Jan 2016 12:37:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124875#comment-15124875 ]

stack commented on HBASE-15180:
-------------------------------



bq. My plan is to make a Cell aware ByteArrayInputStream which can read Cells directly from
it.

Where do we need this? (Trying to follow along.) In the current patch I see it being used inside
an IPCUtils method that returns a CellScanner -- seems odd to use this new Stream in this
method to give to the Codec, which then does the CellScanner Interface.

bq. Plan to introduce some thing like a CodecContext associated with every Codec instance
which can say the server/client context.

Why do we need a Context? Don't we currently make a decoder per Cell type and/or context? Then
we keep a simple Codec API and any messy parsing is internal to the Codec implementation?

bq. SO u suggest renaming of the interface. That should be fine and looks better.

Yeah, I think the suggested name is better, but let's spend some time on how this stuff will be
used first.

I remember being here with this Codec stuff and I kept bumping into the need for a CellInputStream,
but in the end was able to make do with CellScanner; that was then and stuff may be different
now.

bq. To avoid the overhead of parsing tagsLength every time this was done.

Yeah. Let's move away from passing these withTags flags in the code base. When we decode,
we should be able to cheaply figure out whether tags are present or not; let's fix that rather than pass
an extra flag all over.
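
A minimal sketch of what "cheaply figure out whether tags are present" could look like. It assumes the serialized KeyValue layout is a 4-byte key length, a 4-byte value length, the key bytes, the value bytes, and then an optional 2-byte tags length followed by the tags. The class and method names (TagsCheck, tagsLength, serialize) are hypothetical and not part of the patch; the point is only that the total cell length already tells the decoder whether anything trails the value.

```java
import java.nio.ByteBuffer;

public class TagsCheck {
    // Assumed layout: two 4-byte ints (key length, value length) lead every cell.
    static final int INFRASTRUCTURE_SIZE = 2 * Integer.BYTES;
    // Assumed layout: tags, when present, are preceded by a 2-byte length.
    static final int TAGS_LENGTH_SIZE = Short.BYTES;

    /** Returns the number of tag bytes in the cell at [offset, offset + length), or 0 if none. */
    static int tagsLength(byte[] buf, int offset, int length) {
        ByteBuffer bb = ByteBuffer.wrap(buf, offset, length);
        int keyLen = bb.getInt();
        int valueLen = bb.getInt();
        // Whatever is left after key and value can only be the tags section.
        int trailing = length - (INFRASTRUCTURE_SIZE + keyLen + valueLen);
        return trailing > 0 ? trailing - TAGS_LENGTH_SIZE : 0;
    }

    /** Hypothetical helper that builds a cell in the assumed layout, for illustration only. */
    static byte[] serialize(byte[] key, byte[] value, byte[] tags) {
        int tagsPart = tags.length > 0 ? TAGS_LENGTH_SIZE + tags.length : 0;
        ByteBuffer bb = ByteBuffer.allocate(
            INFRASTRUCTURE_SIZE + key.length + value.length + tagsPart);
        bb.putInt(key.length).putInt(value.length).put(key).put(value);
        if (tags.length > 0) {
            bb.putShort((short) tags.length).put(tags);
        }
        return bb.array();
    }

    public static void main(String[] args) {
        byte[] plain = serialize(new byte[] {1, 2, 3}, new byte[] {4, 5}, new byte[0]);
        byte[] tagged = serialize(new byte[] {1, 2, 3}, new byte[] {4, 5}, new byte[] {9, 9, 9, 9});
        System.out.println(tagsLength(plain, 0, plain.length));   // prints 0
        System.out.println(tagsLength(tagged, 0, tagged.length)); // prints 4
    }
}
```

With a check like this the decoder needs two int reads and a subtraction per cell, and no withTags flag has to be threaded through callers.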

bq. This was needed because of the way we have this PushbackIS. 

Shouldn't we pass the length when we create the PBIS derivative?

bq. Now any way you suggest add a new config to decide this copy or not rather than rely on
MSLAB. 

Can we ask our environment if we are on the server side and, if so, just do the non-copy and
presume that MSLAB (or something else, if MSLAB is off) will assume ownership of the Cells
so we can let go of the buffer?  Doing this is a little more indirect but better, I think, than
having an MSLAB reference in RPC.


> Reduce garbage created while reading Cells from Codec Decoder
> -------------------------------------------------------------
>
>                 Key: HBASE-15180
>                 URL: https://issues.apache.org/jira/browse/HBASE-15180
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15180.patch, HBASE-15180_V2.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use KeyValueUtil#iscreate to
read cells from the InputStream. Here we 1st create a byte[] of length 4 and read the cell
length and then an array of Cell's length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created on top of
a ByteArrayInputStream on top of this. By default in write path, we have MSLAB usage ON. So
while adding Cells to memstore, we will copy the Cell bytes to MSLAB memory chunks (default
2 MB size) and recreate Cells over that bytes.  So there is no issue if we create Cells over
the RPC read byte[] directly here in Decoder.  No need for 2 byte[] creation and copy for
every Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells directly from
it.  
> Same Codec path is used in client side also. There better we can avoid this direct Cell
create and continue to do the copy to smaller byte[]s path.  Plan to introduce some thing
like a CodecContext associated with every Codec instance which can say the server/client context.
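
To make the two paths in the description concrete, here is a rough sketch that assumes each cell in the request buffer is prefixed by a 4-byte length. CellSliceDecoder and CellSlice are hypothetical names standing in for the decoder and for Cell; the boolean is there only to show both behaviours side by side and is not a proposal for the API. The no-copy path creates each cell as a window onto the RPC byte[], while the copy path allocates a fresh byte[] per cell as the client side would continue to do.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CellSliceDecoder {
    /** Hypothetical stand-in for a Cell: a window onto a backing array. */
    static final class CellSlice {
        final byte[] buf;
        final int offset;
        final int length;

        CellSlice(byte[] buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * Walks length-prefixed cells in rpcBuf. With copy == false each cell
     * references rpcBuf directly; with copy == true each cell gets its own array.
     */
    static List<CellSlice> decode(byte[] rpcBuf, boolean copy) {
        List<CellSlice> cells = new ArrayList<>();
        ByteBuffer bb = ByteBuffer.wrap(rpcBuf);
        while (bb.remaining() >= Integer.BYTES) {
            int len = bb.getInt();
            if (len < 0 || len > bb.remaining()) {
                throw new IllegalArgumentException("Bad cell length: " + len);
            }
            int start = bb.position();
            if (copy) {
                // Client-style path: one new byte[] per cell.
                cells.add(new CellSlice(Arrays.copyOfRange(rpcBuf, start, start + len), 0, len));
            } else {
                // Server-style path: no allocation beyond the small slice object.
                cells.add(new CellSlice(rpcBuf, start, len));
            }
            bb.position(start + len);
        }
        return cells;
    }
}
```

The no-copy path is only safe if something downstream (MSLAB in the description above) copies the bytes before the request buffer is released.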



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
