hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1234) Change HBase StoreKey format
Date Tue, 17 Mar 2009 06:04:50 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-1234:

    Attachment: 1234.patch

First cut at a patch.  Still has issues and tests don't pass yet.  Ryan Rawson and Jon Gray
contributed to this patch.

Here are notes on what's there so far:

Removing HStoreKey (still present, but will be deprecated).  Added KeyValue in
its stead.  KeyValue is a wrapper around a byte array, offset and length.
KeyValue has comparators that do byte array compares cognizant of our whacky
meta and root key formats.
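
To make the "wrapper around a byte array, offset and length" idea concrete,
here is a minimal sketch of comparing two ranges of a shared buffer without
copying either one.  The class name ByteRangeSketch and its methods are
hypothetical and are not taken from the patch; the real KeyValue comparators
also have to understand the meta and root key formats, which this sketch
ignores.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; this is NOT the KeyValue class from the patch. */
final class ByteRangeSketch {
    private final byte[] bytes;
    private final int offset;
    private final int length;

    ByteRangeSketch(byte[] bytes, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > bytes.length) {
            throw new IllegalArgumentException("range outside backing array");
        }
        this.bytes = bytes;   // shared, not copied
        this.offset = offset;
        this.length = length;
    }

    /** Lexicographic compare of two ranges; bytes are treated as unsigned. */
    static int compare(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes match, so the shorter range sorts first.
        return aLen - bLen;
    }

    int compareTo(ByteRangeSketch other) {
        return compare(bytes, offset, length, other.bytes, other.offset, other.length);
    }

    public static void main(String[] args) {
        // Two "rows" living side by side in one backing array.
        byte[] buf = "row-arow-b".getBytes(StandardCharsets.US_ASCII);
        ByteRangeSketch a = new ByteRangeSketch(buf, 0, 5);
        ByteRangeSketch b = new ByteRangeSketch(buf, 5, 5);
        System.out.println(a.compareTo(b) < 0);
    }
}
```

The point of passing offsets and lengths all the way down is that a row or
column can be compared in place inside a larger buffer, rather than being
copied out into its own array first.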

M b/src/java/org/apache/hadoop/hbase/HRegionInfo.java
    (getComparator): Added.  Returns the comparator to use switching off
    the HRegionInfo type.

M b/src/java/org/apache/hadoop/hbase/HTableDescriptor.java
    Use new families comparator.  Should save us on getting of family
    delimiter, then comparing up to the family delimiter only.

A b/src/java/org/apache/hadoop/hbase/KeyValue.java
    Effectively the new Key.  Key format has changed too to introduce
    Key Type.  See class comment for detail on new format.
    Mostly a bunch of static creates and utility making KeyValues.
    Comparators have stuff like compareRow, matchingRow, etc. Use
    the comparator to do stuff we used to do on the fly in past.
    Has stuff for getting row, column and timestamp offsets and lengths
    so we don't have to copy.  Comparators also take offset and lengths
    so can compare without copying.
M b/src/java/org/apache/hadoop/hbase/filter/RowFilterInterface.java
    Added overrides that take offset and length and deprecated old methods.
M b/src/java/org/apache/hadoop/hbase/io/Cell.java
M b/src/java/org/apache/hadoop/hbase/io/RowResult.java
    Added utility methods to go from KeyValue Lists, etc., to Maps of
    column to Cells.  This is the stuff we'd remove when the client/server
    API changes.  These will be hotspots when we profile.  Did it for Cells
    and RowResults.
M b/src/java/org/apache/hadoop/hbase/io/HalfHFileReader.java
    Bring over HalfHFileReader to use KeyValue.
M b/src/java/org/apache/hadoop/hbase/io/hfile/HFile.java
M b/src/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java
    Changes by Ryan to bring hfile over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/HAbstractScanner.java
    Convert to KeyValue. Make column match take a KeyValue so don't have
    to copy column out to compare it.
M b/src/java/org/apache/hadoop/hbase/regionserver/HLog.java
    Convert HLog over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/HLogEdit.java
    Redo an edit.  Make the value a KeyValue.  Move row out of key.
    We should redo the log file so it's an hfile.
M b/src/java/org/apache/hadoop/hbase/regionserver/HLogKey.java
    Format has changed.  Moved out row.
M b/src/java/org/apache/hadoop/hbase/regionserver/HRegion.java
    Use KeyValue.  Return lists of them rather than MapWritables and
    Cell [].  Use comparators.  Added new Counter class.  Needed in
    getFull to keep a running list of versions (fixed bugs in here,
    particularly in memcache where we were doing the count all wrong).
    Added utility functions to help with getFull: okToAddResult, addResult,
    hasEnoughVersions.  Cleanup.  We were doing things like checking
    for read only in two methods; parent check was enough.
    Removed all of that localput stuff -- it was weird.  May have made
    sense once (prompted by jgray).
M b/src/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
    Bring over to new regime.  In here we do some of the conversions from
    new style list of KeyValue to old style RowResult.
M b/src/java/org/apache/hadoop/hbase/regionserver/InternalScanner.java
    Convert to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/Memcache.java
    Redo as a Set of KeyValues.  Use ConcurrentSkipListSet so we can
    undo all synchronization.  Means that iterators no longer fail
    fast but I think that's fine -- you get a view on the data at the
    time the Iterator was taken out.
M b/src/java/org/apache/hadoop/hbase/regionserver/Store.java
    Move over to KeyValue.  Bunch of refactoring to make it work.
M b/src/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
    Move over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java
    Move over to KeyValue.  Removed ViableRow class.  Not needed any
    more it looks like.
M b/src/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
    Move over to KeyValue.
M b/src/java/org/apache/hadoop/hbase/util/Bytes.java
    Fix SIZEOF_BYTES. Added facility.  Most in here was done by jgray,
    in particular the stuff that provides ByteBuffer facility such as
    putInt, putLong, etc.  Ryan added the binarySearch of byte array,
    offset, and length tuples.
M b/src/java/org/apache/hadoop/hbase/util/MetaUtils.java
M b/src/java/org/apache/hadoop/hbase/util/Merge.java
    Still to do.
A b/src/test/org/apache/hadoop/hbase/TestKeyValue.java
    Test for KeyValue.
M b/src/test/org/apache/hadoop/hbase/io/hfile/TestSeekTo.java
    Mods for changes in HFile.
M b/src/test/org/apache/hadoop/hbase/regionserver/TestHMemcache.java
    Fixup of tests for Memcache.
M b/src/test/org/apache/hadoop/hbase/util/TestBytes.java
    Tests for Bytes.
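
On the Memcache note above about iterators no longer failing fast: a small
sketch of what that looks like with java.util.concurrent.ConcurrentSkipListSet.
The class name SkipListIterationSketch and the sample keys are made up for
illustration and are not from the patch.  The JDK documents these iterators as
weakly consistent: they never throw ConcurrentModificationException and they
traverse the elements that existed when the iterator was created, but they may
or may not also show later additions, so the sketch makes no claim either way
about the late arrival.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical illustration only; this is NOT the Memcache class from the patch. */
final class SkipListIterationSketch {

    /** Takes an iterator, then modifies the set, then drains the iterator. */
    static List<String> iterateWhileAdding(ConcurrentSkipListSet<String> set, String lateArrival) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = set.iterator();
        // Structural change after the iterator exists.  A fail-fast iterator
        // (for example over a TreeSet) would throw on the next call to next().
        set.add(lateArrival);
        while (it.hasNext()) {
            seen.add(it.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        ConcurrentSkipListSet<String> set =
            new ConcurrentSkipListSet<>(Arrays.asList("row-a", "row-b", "row-c"));
        List<String> seen = iterateWhileAdding(set, "row-d");
        // Elements present when the iterator was created are always seen.
        System.out.println(seen.containsAll(Arrays.asList("row-a", "row-b", "row-c")));
        // The set itself always holds the late arrival.
        System.out.println(set.contains("row-d"));
    }
}
```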

> Change HBase StoreKey format
> ----------------------------
>                 Key: HBASE-1234
>                 URL: https://issues.apache.org/jira/browse/HBASE-1234
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.20.0
>         Attachments: 1234.patch
> HBASE-859 cleaned up keys, removing the need for HRegionInfo to be in the context when
> comparing keys.  This issue is about changing the format.  Work done in HBASE-859 means
> changes have been localized to HStoreKey, in particular to comparators and parse routines.
> We should do this now since 0.20.0 will require rewriting all data.
> Things to consider:
> <row> <columnfamily> <columnqualifier> <timestamp> <keytype>
> Or leave off columnfamily altogether and just write it once into the hfile metadata (All
> key compares are done in the Store context so columnfamily can be safely left out of the
> equation; it's only when the key rises above Store that the columnfamily needs appending).
> keytype is probably a byte. Types are delete cell, delete row, delete family, delete
> column?  What else?  Where should we put it?  At the end?  How should type sort?  Or should
> it not be part of sort so it's just the order in which we encounter the key?
> How are we going to support keys that go in out of chronological order?

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
