hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10974) Improve DBEs read performance by avoiding byte array deep copies for key[] and value[]
Date Wed, 15 Oct 2014 15:52:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172499#comment-14172499 ]

Anoop Sam John commented on HBASE-10974:
----------------------------------------

{code}
       currentBuffer = buffer;
+      // Allocate every time we get a new block
+      // Would be great if from the block we know how much is key part and how
+      // much is for value part(the unencoded one). If this value exceeds we
+      // may need to do a copy
+      // TODO : Get the unencoded key length from the hfileblock
+      current.keyBuffer = ByteBuffer.allocate(currentBuffer.capacity() * 16);
{code}
Why do we allocate such a large buffer for the key? 'currentBuffer' is the buffer containing the
whole block's data, and here we allocate a buffer 16 times bigger than that! I don't see why this is needed, Ram.
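
To put a number on the concern, here is a minimal sketch of the arithmetic behind the allocation in the snippet above. The 64 KB block size is an assumption chosen for illustration (the actual block size depends on the table configuration), and the class and method names are hypothetical, not part of the patch.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; not part of the HBASE-10974 patch.
final class KeyBufferSizing {

    // Mirrors the sizing expression in the patch: currentBuffer.capacity() * 16.
    static int keyBufferBytes(int blockCapacity) {
        return blockCapacity * 16;
    }

    public static void main(String[] args) {
        int blockCapacity = 64 * 1024; // assumed block size of 64 KB
        // Allocate the key buffer the same way the patch does, once per new block.
        ByteBuffer keyBuffer = ByteBuffer.allocate(keyBufferBytes(blockCapacity));
        System.out.println("block bytes      = " + blockCapacity);
        System.out.println("key buffer bytes = " + keyBuffer.capacity());
    }
}
```

Under that assumption, every new block triggers a 1 MB heap allocation for keys alone, which is the cost being questioned here.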


> Improve DBEs read performance by avoiding byte array deep copies for key[] and value[]
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-10974
>                 URL: https://issues.apache.org/jira/browse/HBASE-10974
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 0.99.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.99.2
>
>         Attachments: HBASE-10974_1.patch
>
>
> As part of HBASE-10801, we tried to reduce the copying of the value[] when forming the KV from the DBEs.
> The keys still required copying, which restricted us in using Cells because a copy always had to be done.
> The idea here is to replace the key byte[] with a ByteBuffer and create a consecutive stream of the keys (currently the same byte[] is reused, hence the copy). Use offset and length to track each key within this key ByteBuffer.
> The copy from the encoded format to the normal key format is definitely needed and can't be avoided, but we could always avoid the deep copy of the bytes to form a KV and thus use Cells effectively. Working on a patch, will post it soon.
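
The offset-and-length idea in the quoted description can be sketched roughly as follows. This is only an illustration under stated assumptions: `KeyStream`, `append`, and `view` are hypothetical names, not classes or methods from HBase or from the attached patch. Decoded keys are written one after another into a single ByteBuffer, and each key is later exposed as a view that shares the same backing storage, so no deep copy is made when handing the key out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the HBase implementation.
final class KeyStream {
    private final ByteBuffer keys;

    KeyStream(int capacity) {
        keys = ByteBuffer.allocate(capacity);
    }

    // Appends one decoded key and returns the offset at which it starts.
    // Throws BufferOverflowException if the key does not fit in the remaining space.
    int append(byte[] key) {
        int offset = keys.position();
        keys.put(key);
        return offset;
    }

    // Returns a view over [offset, offset + length) that shares the backing
    // storage with the stream, so the key bytes are not copied again.
    ByteBuffer view(int offset, int length) {
        ByteBuffer dup = keys.duplicate();
        dup.position(offset);
        dup.limit(offset + length);
        return dup.slice();
    }

    public static void main(String[] args) {
        KeyStream stream = new KeyStream(64);
        byte[] first = "row1/cf:q1".getBytes(StandardCharsets.UTF_8);
        byte[] second = "row2/cf:q1".getBytes(StandardCharsets.UTF_8);
        int firstOffset = stream.append(first);
        int secondOffset = stream.append(second);
        ByteBuffer key = stream.view(secondOffset, second.length);
        System.out.println(firstOffset + " " + secondOffset + " " + key.remaining());
    }
}
```

The one unavoidable copy, from the encoded form into the key stream, happens in `append`; the sketch does not address how large the stream must be, which is the sizing question raised in the comment above.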



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
