hbase-issues mailing list archives

From "Xiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14882) Provide a Put API that adds the provided family, qualifier, value without copying
Date Fri, 02 Dec 2016 12:26:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15715001#comment-15715001 ]

Xiang Li commented on HBASE-14882:
----------------------------------

Hi [~anoop.hbase], thanks for your time and comments!

May I ask some questions about your comments?
1. Regarding
bq. This extra copy can be avoided easily
Sorry, I did not get your idea. Do you mean that there is another function in KeyValueUtil
that can help in write() here without doing the extra copy? Or do you mean to write it as
follows? (I originally meant to use the following code, but in patch 004 I ended up using
KeyValueUtil#appendToByteArray() instead.)
{code}
  public int write(OutputStream out, boolean withTags) throws IOException {
    // Key length and then value length
    out.write(Bytes.toBytes(KeyValueUtil.keyLength(this)));
    out.write(Bytes.toBytes(getValueLength()));

    // Row length and then row byte array
    out.write(Bytes.toBytes(getRowLength()));
    out.write(getRowArray(), getRowOffset(), getRowLength());

    // Family length and then family byte array
    out.write(getFamilyLength());
    out.write(getFamilyArray(), getFamilyOffset(), getFamilyLength());

    // Qualifier byte array, no qualifier length
    out.write(getQualifierArray(), getQualifierOffset(), getQualifierLength());

    // Timestamp
    out.write(Bytes.toBytes(getTimestamp()));

    // Type
    out.write(getTypeByte());

    // Value
    out.write(getValueArray(), getValueOffset(), getValueLength());

    // Tags length and tags byte array
    if (withTags && getTagsLength() > 0) {
      // Tags length
      byte[] bufferForTagsLength = new byte[2];
      Bytes.putAsShort(bufferForTagsLength, 0, getTagsLength());
      out.write(bufferForTagsLength);

      // Tags byte array
      out.write(getTagsArray(), getTagsOffset(), getTagsLength());
    }

    return getSerializedSize(withTags);
  }
{code}
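
For illustration only, here is a minimal standalone sketch of the same field-by-field idea that does not depend on HBase. The class name KeyValueLayoutSketch and its methods are hypothetical, tags are left out, and the layout is simply the one the comments in the code above describe (key length, value length, row, family, qualifier, timestamp, type, value). Each field goes straight to the stream, so no intermediate array holding the whole cell is built in write().

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sketch, not HBase code: streams a cell in the layout described above.
final class KeyValueLayoutSketch {
  // Fixed-width parts of the key: row length (2) + family length (1) + timestamp (8) + type (1).
  static final int KEY_FIXED_SIZE = 2 + 1 + 8 + 1;

  static int keyLength(byte[] row, byte[] family, byte[] qualifier) {
    return KEY_FIXED_SIZE + row.length + family.length + qualifier.length;
  }

  // Writes every field directly to the stream and returns the number of bytes written.
  static int write(OutputStream out, byte[] row, byte[] family, byte[] qualifier,
      long timestamp, byte type, byte[] value) throws IOException {
    DataOutputStream data = new DataOutputStream(out); // big-endian, like Bytes.toBytes()
    int keyLen = keyLength(row, family, qualifier);
    data.writeInt(keyLen);          // key length
    data.writeInt(value.length);    // value length
    data.writeShort(row.length);    // row length, then row bytes
    data.write(row);
    data.writeByte(family.length);  // family length, then family bytes
    data.write(family);
    data.write(qualifier);          // qualifier bytes, no qualifier length
    data.writeLong(timestamp);      // timestamp
    data.writeByte(type);           // type
    data.write(value);              // value
    data.flush();
    return 4 + 4 + keyLen + value.length;
  }

  // Convenience for demonstration only: collects the stream output into a byte[].
  static byte[] serialize(byte[] row, byte[] family, byte[] qualifier,
      long timestamp, byte type, byte[] value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try {
      write(buffer, row, family, qualifier, timestamp, type, value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  public static void main(String[] args) {
    byte[] bytes = serialize(new byte[] {1, 2}, new byte[] {3, 4}, new byte[] {5},
        1L, (byte) 4, new byte[] {6, 7});
    System.out.println(bytes.length + " bytes written");
  }
}
```

The serialize() helper exists only so the result can be inspected; a real caller would pass its own OutputStream to write().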

2. Regarding
bq. We add size of 5 refs. All are array type. Means we have to include 5 * ClassSize.ARRAY
I put 5 * ClassSize.ARRAY into the calculation in heapSize() (where ClassSize.sizeOf() is called), not
in heapOverhead(). Do you mean to move the ClassSize.ARRAY into heapOverhead()? I followed
KeyValue, in which the ClassSize.ARRAY of bytes is included in heapSize().

3. Regarding heapOverhead() and heapSize() in KeyValue
{code}
  public long heapSize() {
    long sum = FIXED_OVERHEAD;
    /*
     * Deep object overhead for this KV consists of two parts. The first part is the KV object
     * itself, while the second part is the backing byte[]. We will only count the array overhead
     * from the byte[] only if this is the first KV in there.
     */
    return ClassSize.align(sum) +
        (offset == 0
          ? ClassSize.sizeOf(bytes, length) // count both length and object overhead
          : length);                        // only count the number of bytes
  }
{code}
heapOverhead() does not do the alignment (padding); the alignment of the overhead is performed
in heapSize(). I have a different idea: heapOverhead() should do the alignment before it
returns, because the space consumed by padding cannot be used by anything else. What do you think?
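
To make the alignment question concrete, here is a small standalone sketch. It assumes 8-byte object alignment and that sizing an array means aligning (array header + data length). The class HeapAlignmentSketch and the sizes 36 and 16 are hypothetical and chosen only to show the effect; they are not the real HBase constants.

```java
// Hypothetical sketch, not HBase code: shows how much an unaligned overhead under-reports.
final class HeapAlignmentSketch {
  // Rounds size up to the next multiple of 8 (assumed JVM object alignment).
  static long align(long size) {
    return (size + 7L) & ~7L;
  }

  // Bytes consumed by padding that an unaligned overhead figure would not report.
  static long padding(long unalignedOverhead) {
    return align(unalignedOverhead) - unalignedOverhead;
  }

  // Heap size in the style of the quoted heapSize() for the offset == 0 case:
  // aligned fixed overhead plus the aligned size of the backing array.
  static long heapSize(long fixedOverhead, long arrayHeader, long dataLength) {
    return align(fixedOverhead) + align(arrayHeader + dataLength);
  }

  public static void main(String[] args) {
    long fixedOverhead = 36; // hypothetical unaligned object overhead
    long arrayHeader = 16;   // hypothetical array header size
    System.out.println("unaligned overhead: " + fixedOverhead);
    System.out.println("aligned overhead:   " + align(fixedOverhead));
    System.out.println("padding:            " + padding(fixedOverhead));
    System.out.println("heap size, 10 data bytes: " + heapSize(fixedOverhead, arrayHeader, 10));
  }
}
```

With these numbers the unaligned overhead is 4 bytes smaller than what the object actually occupies, which is the gap the question is about.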

> Provide a Put API that adds the provided family, qualifier, value without copying
> ---------------------------------------------------------------------------------
>
>                 Key: HBASE-14882
>                 URL: https://issues.apache.org/jira/browse/HBASE-14882
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Jerry He
>            Assignee: Xiang Li
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14882.master.000.patch, HBASE-14882.master.001.patch, HBASE-14882.master.002.patch, HBASE-14882.master.003.patch, HBASE-14882.master.004.patch
>
>
> In the Put API, we have addImmutable()
> {code}
>  /**
>    * See {@link #addColumn(byte[], byte[], byte[])}. This version expects
>    * that the underlying arrays won't change. It's intended
>    * for usage internal HBase to and for advanced client applications.
>    */
>   public Put addImmutable(byte [] family, byte [] qualifier, byte [] value)
> {code}
> But in the implementation, the family, qualifier and value are still being copied locally to create kv.
> Hopefully we should provide an API that truly uses immutable family, qualifier and value.
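
As a rough illustration of the difference the issue description refers to, the following standalone sketch contrasts copying a caller's array with keeping a reference to it. The class CopyVsReferenceSketch and its methods are hypothetical and have nothing to do with the actual Put or KeyValue implementation.

```java
import java.util.Arrays;

// Hypothetical sketch, not HBase code: a holder that either copies or shares the caller's array.
final class CopyVsReferenceSketch {
  private final byte[] value;

  private CopyVsReferenceSketch(byte[] value) {
    this.value = value;
  }

  // Defensive copy: costs an allocation, but later changes by the caller are not visible.
  static CopyVsReferenceSketch copying(byte[] v) {
    return new CopyVsReferenceSketch(Arrays.copyOf(v, v.length));
  }

  // No copy: cheaper, but only safe if the caller never modifies the array afterwards.
  static CopyVsReferenceSketch sharing(byte[] v) {
    return new CopyVsReferenceSketch(v);
  }

  boolean sharesArrayWith(byte[] v) {
    return value == v;
  }

  byte firstByte() {
    return value[0];
  }
}
```

The sharing variant is why such an API has to expect that the underlying arrays will not change.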



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
