Date: Wed, 9 Mar 2016 02:53:40 +0000 (UTC)
From: "Enis Soztutar (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-15180) Reduce garbage created while reading Cells from Codec Decoder

    [ https://issues.apache.org/jira/browse/HBASE-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186384#comment-15186384 ]

Enis Soztutar commented on HBASE-15180:
---------------------------------------

The patch looks much cleaner than v4 I think.

Let's add a javadoc to this saying that the ownership / lifecycle of the passed BB is transferred to the Decoder and Cells returned from the decoder, so that the contract is explicit:
{code}
+  Decoder getDecoder(ByteBuffer buf);
{code}

Again, the contract for the {{IPCUtil.createCellScanner()}} used by client side vs server side should be explicit. How do we even know that one is used only by the client or server? We can declare it in the method name explicitly, like {{createCellScannerReusingBuffers()}} (sorry for the horrible name).

Otherwise looks good. [~saint.ack@gmail.com] give it a quick glance?

> Reduce garbage created while reading Cells from Codec Decoder
> -------------------------------------------------------------
>
>                 Key: HBASE-15180
>                 URL: https://issues.apache.org/jira/browse/HBASE-15180
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>    Affects Versions: 0.98.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15180.patch, HBASE-15180_V2.patch, HBASE-15180_V4.patch, HBASE-15180_V6.patch, HBASE-15180_V7.patch
>
>
> In KeyValueDecoder#parseCell (Default Codec decoder) we use KeyValueUtil#iscreate to read cells from the InputStream. Here we 1st create a byte[] of length 4 and read the cell length and then an array of Cell's length and read in cell bytes into it and create a KV.
> Actually in server we read the reqs into a byte[] and CellScanner is created on top of a ByteArrayInputStream on top of this. By default in write path, we have MSLAB usage ON. So while adding Cells to memstore, we will copy the Cell bytes to MSLAB memory chunks (default 2 MB size) and recreate Cells over that bytes. So there is no issue if we create Cells over the RPC read byte[] directly here in Decoder. No need for 2 byte[] creation and copy for every Cell in request.
> My plan is to make a Cell aware ByteArrayInputStream which can read Cells directly from it.
> Same Codec path is used in client side also. There better we can avoid this direct Cell create and continue to do the copy to smaller byte[]s path. Plan to introduce some thing like a CodecContext associated with every Codec instance which can say the server/client context.
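
To make the per-cell copying cost in the description above concrete, here is a minimal, self-contained sketch. It is not the HBase code and not the attached patch: {{CellDecodeSketch}}, {{CellView}}, {{encode}}, {{decodeCopying}} and {{decodeInPlace}} are hypothetical names, and the wire format is assumed to be a 4-byte big-endian length followed by the cell bytes. The first decoder allocates a length array and a fresh array per cell, as the description says the default decoder does. The second hands back offset/length views over the buffer that was read, which is only safe if nobody reuses or mutates that buffer afterwards. That is the ownership contract the comment asks to document.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class CellDecodeSketch {

    /** Hypothetical stand-in for a Cell: a slice of some backing array. */
    static final class CellView {
        final byte[] backing;
        final int offset;
        final int length;

        CellView(byte[] backing, int offset, int length) {
            this.backing = backing;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Builds the assumed wire format: [4-byte length][cell bytes] per cell. */
    static byte[] encode(byte[]... cells) {
        int total = 0;
        for (byte[] c : cells) {
            total += 4 + c.length;
        }
        ByteBuffer bb = ByteBuffer.allocate(total);
        for (byte[] c : cells) {
            bb.putInt(c.length).put(c);
        }
        return bb.array();
    }

    /** Copying path: a byte[4] for the length plus a fresh byte[] per cell. */
    static List<CellView> decodeCopying(InputStream in) {
        List<CellView> cells = new ArrayList<>();
        try {
            DataInputStream din = new DataInputStream(in);
            // available() is adequate here only because the sketch reads
            // from an in-memory stream; do not rely on it for sockets.
            while (din.available() > 0) {
                byte[] lenBytes = new byte[4];
                din.readFully(lenBytes);
                int len = ByteBuffer.wrap(lenBytes).getInt();
                byte[] copy = new byte[len];
                din.readFully(copy);
                cells.add(new CellView(copy, 0, len));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cells;
    }

    /**
     * In-place path: no per-cell arrays. Every returned view points into
     * rpcBuffer, so the caller gives up the right to reuse that buffer.
     */
    static List<CellView> decodeInPlace(byte[] rpcBuffer) {
        ByteBuffer bb = ByteBuffer.wrap(rpcBuffer);
        List<CellView> cells = new ArrayList<>();
        while (bb.remaining() >= 4) {
            int len = bb.getInt();
            cells.add(new CellView(rpcBuffer, bb.position(), len));
            bb.position(bb.position() + len); // skip over the cell bytes
        }
        return cells;
    }

    public static void main(String[] args) {
        byte[] buf = encode("row1".getBytes(StandardCharsets.UTF_8),
                            "r2".getBytes(StandardCharsets.UTF_8));
        for (CellView c : decodeInPlace(buf)) {
            System.out.println("offset=" + c.offset + " length=" + c.length
                    + " sharesBuffer=" + (c.backing == buf));
        }
        for (CellView c : decodeCopying(new ByteArrayInputStream(buf))) {
            System.out.println("offset=" + c.offset + " length=" + c.length
                    + " sharesBuffer=" + (c.backing == buf));
        }
    }
}
```

In this sketch the in-place views stay valid only as long as {{buf}} is left alone. That is acceptable on the server write path described above, where the cell bytes are copied again into MSLAB chunks, and it is the reason the description prefers to keep the copying path on the client side.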
-- This message was sent by Atlassian JIRA (v6.3.4#6332)