hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10531) Revisit how the key byte[] is passed to HFileScanner.seekTo and reseekTo
Date Tue, 04 Mar 2014 11:54:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13919313#comment-13919313 ]

ramkrishna.s.vasudevan commented on HBASE-10531:
------------------------------------------------

bq.otherwise we'll add yet another copy to an already expensive part of the scanning.
I have a way to work around this.  Since we are already creating a cell here for the comparison,
I will create a new KV here, and that will not do any copy.
{code}
 public static class DerivedKeyValue extends KeyValue {

    // Backing array plus the position and size of the key-only bytes within it.
    private final byte[] b;
    private int offset = 0;
    private int length = 0;

    public DerivedKeyValue(byte[] b, int offset, int length) {
      super(b, offset, length);
      this.b = b;
      this.offset = offset;
      this.length = length;
    }

    public void setKeyLength(int kLength) {
      this.length = kLength;
    }

    public void setKeyOffset(int kOffset) {
      this.offset = kOffset;
    }

    @Override
    public int getKeyOffset() {
      return this.offset;
    }

    @Override
    public int getKeyLength() {
      return this.length;
    }

    @Override
    public byte[] getRowArray() {
      return b;
    }

    @Override
    public short getRowLength() {
      // The key starts with the 2-byte row length.
      return Bytes.toShort(this.b, getKeyOffset());
    }

    @Override
    public int getRowOffset() {
      // The row bytes follow the 2-byte row length.
      return getKeyOffset() + Bytes.SIZEOF_SHORT;
    }

    @Override
    public byte[] getFamilyArray() {
      return b;
    }

    @Override
    public byte getFamilyLength() {
      // The family length byte sits immediately before the family bytes.
      return this.b[getFamilyOffset() - 1];
    }

    @Override
    public int getFamilyOffset() {
      // Skip the row length (short), the row bytes, and the 1-byte family length.
      return this.offset + Bytes.SIZEOF_SHORT + getRowLength() + Bytes.SIZEOF_BYTE;
    }

    @Override
    public byte[] getQualifierArray() {
      return b;
    }

    @Override
    public int getQualifierLength() {
      return getQualifierLength(getRowLength(), getFamilyLength());
    }

    @Override
    public int getQualifierOffset() {
      // No change from the parent implementation.
      return super.getQualifierOffset();
    }

    // The qualifier is whatever remains of the key after the row, the family,
    // and the fixed-size fields.
    private int getQualifierLength(int rlength, int flength) {
      return getKeyLength() - (int) getKeyDataStructureSize(rlength, flength, 0);
    }
}
{code}
If you look at it, the only difference between a normal KV and the one created by
KeyValue.createKeyValueFromKeyValue is that we don't actually need the first 8 bytes (ROW_OFFSET).
So by avoiding those bytes, if we implement our own getKeyLength, getRowOffset, etc., we will be
able to do a proper comparison.  We can then compare the rows, families, and qualifiers
individually.  What do you think?  This way we avoid the byte copy, but we do create a new object.
I think that is still going to be cheaper.
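
To make the offset arithmetic concrete, here is a minimal standalone sketch that does not depend on HBase. It assumes the key-only layout is: 2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type. The class KeyOnlyView and its methods are hypothetical names used only for illustration; they are not part of KeyValue or of the patch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not an HBase class): reads key fields in place, without copying. */
final class KeyOnlyView {
    static final int SIZEOF_SHORT = 2;
    static final int SIZEOF_BYTE = 1;
    static final int TIMESTAMP_TYPE_SIZE = 8 + 1; // long timestamp + 1 type byte

    private final byte[] b;   // backing array, shared with the caller
    private final int offset; // where the key (its row-length short) starts in b
    private final int length; // total key length in bytes

    KeyOnlyView(byte[] b, int offset, int length) {
        this.b = b;
        this.offset = offset;
        this.length = length;
    }

    // Big-endian 2-byte row length at the start of the key.
    int rowLength() { return (short) (((b[offset] & 0xff) << 8) | (b[offset + 1] & 0xff)); }
    int rowOffset() { return offset + SIZEOF_SHORT; }
    // Family bytes start after the row and the 1-byte family length.
    int familyOffset() { return rowOffset() + rowLength() + SIZEOF_BYTE; }
    int familyLength() { return b[familyOffset() - 1]; }
    int qualifierOffset() { return familyOffset() + familyLength(); }
    // Qualifier length is what remains after all other fields are subtracted.
    int qualifierLength() {
        return length - (SIZEOF_SHORT + rowLength() + SIZEOF_BYTE + familyLength() + TIMESTAMP_TYPE_SIZE);
    }

    /** Compares row, then family, then qualifier, reading both backing arrays in place. */
    int compareRowFamilyQualifier(KeyOnlyView o) {
        int c = compareRange(b, rowOffset(), rowLength(), o.b, o.rowOffset(), o.rowLength());
        if (c != 0) return c;
        c = compareRange(b, familyOffset(), familyLength(), o.b, o.familyOffset(), o.familyLength());
        if (c != 0) return c;
        return compareRange(b, qualifierOffset(), qualifierLength(),
                            o.b, o.qualifierOffset(), o.qualifierLength());
    }

    /** Unsigned lexicographic comparison of two byte ranges. */
    static int compareRange(byte[] x, int xo, int xl, byte[] y, int yo, int yl) {
        int n = Math.min(xl, yl);
        for (int i = 0; i < n; i++) {
            int d = (x[xo + i] & 0xff) - (y[yo + i] & 0xff);
            if (d != 0) return d;
        }
        return xl - yl;
    }

    /** Builds a key-only byte[] in the assumed layout; used here only to create sample data. */
    static byte[] buildKey(String row, String family, String qualifier, long ts, byte type) {
        byte[] r = row.getBytes(StandardCharsets.UTF_8);
        byte[] f = family.getBytes(StandardCharsets.UTF_8);
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(
            SIZEOF_SHORT + r.length + SIZEOF_BYTE + f.length + q.length + TIMESTAMP_TYPE_SIZE);
        buf.putShort((short) r.length).put(r).put((byte) f.length).put(f).put(q).putLong(ts).put(type);
        return buf.array();
    }
}
```

The view holds only a reference to the caller's array plus an offset and a length, so constructing it allocates one small object and copies no key bytes. The same accessors work whether the key starts at offset 0 or sits somewhere inside a larger buffer.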


> Revisit how the key byte[] is passed to HFileScanner.seekTo and reseekTo
> ------------------------------------------------------------------------
>
>                 Key: HBASE-10531
>                 URL: https://issues.apache.org/jira/browse/HBASE-10531
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.99.0
>
>         Attachments: HBASE-10531.patch, HBASE-10531_1.patch
>
>
> Currently the byte[] key passed to HFileScanner.seekTo and HFileScanner.reseekTo is
> a combination of row, cf, qual, type and ts.  The caller forms this by using kv.getBuffer,
> which is actually deprecated.  So see how this can be achieved considering kv.getBuffer is
> removed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
