hbase-issues mailing list archives

From "Xiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14882) Provide a Put API that adds the provided family, qualifier, value without copying
Date Mon, 05 Dec 2016 03:54:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15721141#comment-15721141 ]

Xiang Li commented on HBASE-14882:
----------------------------------

[~anoop.hbase] I uploaded patch 005 for the master branch to address your comments, and I have
some questions 8-)

1. The following changes were made according to your comments:
1.1. Update write(OutputStream out, boolean withTags) to avoid a local copy of the byte[],
following ValueAndTagRewriteCell
1.2. Update heapOverhead() to
       (a) Consider array headers in heapOverhead
       (b) Use TIMESTAMP_TYPE_SIZE as the sum of the sizes of timestamp and type
       (c) Make FIXED_OVERHEAD a static final, calculated when the class is initialized
1.3. Update deepClone() to return a KeyValue object
1.4. Correct the indentation; all updated files have been checked
1.5. A new JIRA, HBASE-17254, has been opened to track a possible alignment update when
calculating heapOverhead()

2. Some questions
2.1. When calculating heapOverhead(), I do not think I can reduce it to a single constant
and have heapOverhead() return that constant directly. There are two parts. The first part is
FIXED_OVERHEAD, which can be a constant. The second part, the array headers for all backing
byte arrays, has to be calculated after the instance has been created: for family, qualifier,
value and tags, ClassSize.ARRAY is added only when the array is not null.
2.2. In write(OutputStream out, boolean withTags), I return getSerializedSize(withTags) directly
as the number of bytes written. I saw that you calculate len in ValueAndTagRewriteCell's write()
by adding up the sizes after each write to the output stream. Your method is the safest,
while returning getSerializedSize(withTags) would be more concise. Do you think it is safe
to return getSerializedSize(withTags) directly? Based on my tests it is safe, but I am not
sure whether there are conditions I did not cover. Please advise.
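
The two questions above can be illustrated with a minimal Java sketch. It is not the code from the patch: the class NoCopyCellSketch, the two size constants, the always-present row array and the length-prefixed layout are all assumptions made for illustration (HBase takes the real sizes from ClassSize, and the real serialization is the KeyValue format). It only shows why the array-header part of heapOverhead() depends on the instance, and how an accumulated byte count can be compared with getSerializedSize(withTags).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical cell that keeps references to caller-supplied arrays (not HBase code). */
class NoCopyCellSketch {
    // Assumed placeholder sizes; HBase derives the real values in ClassSize.
    static final long ARRAY_HEADER = 16;
    static final long FIXED_OVERHEAD = 64;

    private final byte[] row;       // assumed to be always non-null
    private final byte[] family;    // may be null
    private final byte[] qualifier; // may be null
    private final byte[] value;     // may be null
    private final byte[] tags;      // may be null

    NoCopyCellSketch(byte[] row, byte[] family, byte[] qualifier, byte[] value, byte[] tags) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.value = value;
        this.tags = tags;
    }

    /** Constant part plus one array header per backing array that is actually present. */
    long heapOverhead() {
        long overhead = FIXED_OVERHEAD + ARRAY_HEADER; // header of the row array
        for (byte[] optional : new byte[][] {family, qualifier, value, tags}) {
            if (optional != null) {
                overhead += ARRAY_HEADER;
            }
        }
        return overhead;
    }

    /** Size of one field in the assumed layout: 4-byte length, then the bytes. */
    private static int fieldSize(byte[] field) {
        return Integer.BYTES + (field == null ? 0 : field.length);
    }

    /** Size computed arithmetically, without writing anything. */
    int getSerializedSize(boolean withTags) {
        int size = fieldSize(row) + fieldSize(family) + fieldSize(qualifier) + fieldSize(value);
        return withTags ? size + fieldSize(tags) : size;
    }

    /** Writes one field straight from the backing array and returns the bytes written. */
    private static int writeField(OutputStream out, byte[] field) throws IOException {
        int len = field == null ? 0 : field.length;
        out.write(len >>> 24);
        out.write(len >>> 16);
        out.write(len >>> 8);
        out.write(len);
        if (len > 0) {
            out.write(field, 0, len);
        }
        return Integer.BYTES + len;
    }

    /** Accumulates the count after each write instead of returning the computed size. */
    int write(OutputStream out, boolean withTags) throws IOException {
        int written = writeField(out, row) + writeField(out, family)
            + writeField(out, qualifier) + writeField(out, value);
        if (withTags) {
            written += writeField(out, tags);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        NoCopyCellSketch cell = new NoCopyCellSketch(
            new byte[] {1}, new byte[] {2, 3}, null, new byte[] {4, 5, 6}, new byte[] {7});
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int written = cell.write(out, true);
        System.out.println("accumulated == computed: " + (written == cell.getSerializedSize(true)));
        System.out.println("heapOverhead: " + cell.heapOverhead());
    }
}
```

In this sketch the two counts agree as long as write() and getSerializedSize() walk the same fields under the same conditions. If one of them later gains a branch the other lacks, only the accumulated count still reflects what reached the stream, which is the sense in which accumulating is the more defensive choice.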

> Provide a Put API that adds the provided family, qualifier, value without copying
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-14882
>                 URL: https://issues.apache.org/jira/browse/HBASE-14882
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Jerry He
>            Assignee: Xiang Li
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14882.master.000.patch, HBASE-14882.master.001.patch, HBASE-14882.master.002.patch,
> HBASE-14882.master.003.patch, HBASE-14882.master.004.patch, HBASE-14882.master.005.patch
>
>
> In the Put API, we have addImmutable()
> {code}
>  /**
>    * See {@link #addColumn(byte[], byte[], byte[])}. This version expects
>    * that the underlying arrays won't change. It's intended
>    * for usage internal HBase to and for advanced client applications.
>    */
>   public Put addImmutable(byte [] family, byte [] qualifier, byte [] value)
> {code}
> But in the implementation, the family, qualifier and value are still being copied locally
> to create kv.
> Hopefully we should provide an API that truly uses immutable family, qualifier and value.
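
The difference the issue description points at, copying the caller's arrays versus keeping references to them, can be shown with a small Java sketch. The class and method names here are hypothetical and do not come from the Put API; the sketch only demonstrates why a no-copy path has to expect that the underlying arrays won't change.

```java
import java.util.Arrays;

/** Hypothetical illustration of copy versus reference semantics (not HBase code). */
class CopyVersusReferenceSketch {
    /** Stores a private copy, so later changes by the caller are not visible. */
    static byte[] storeWithCopy(byte[] value) {
        return Arrays.copyOf(value, value.length);
    }

    /** Stores the caller's array itself, so no bytes are copied. */
    static byte[] storeWithoutCopy(byte[] value) {
        return value;
    }

    public static void main(String[] args) {
        byte[] value = {1, 2, 3};
        byte[] copied = storeWithCopy(value);
        byte[] shared = storeWithoutCopy(value);

        value[0] = 9; // the caller mutates the array after handing it over

        System.out.println(copied[0]); // prints 1: the copy is unaffected
        System.out.println(shared[0]); // prints 9: the stored reference sees the change
    }
}
```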



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
