hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hbase/Plan-0.2/APIChanges" by izaakrubin
Date Fri, 18 Jul 2008 21:19:52 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by izaakrubin:

New page:
= Changes to HBaseAdmin and HTable between HBase 0.1.3 and 0.2 =

The APIs for both `HBaseAdmin` and `HTable` changed significantly between versions 0.1.3
and 0.2.  This document summarizes the key differences that matter most to
developers using these classes.  For more information, see the full API documentation
for <0.1.3> and <0.2>.

A general difference in both `HBaseAdmin` and `HTable` is a transition away from using Hadoop's
`Text` class (package `org.apache.hadoop.io`) and toward using `byte[]` as a replacement.
 All methods from 0.1.3 that returned either `Text` or `Text[]` now return either `byte[]`
or `byte[][]`.  However, HBase 0.2 has not completely abandoned the use of `Text`, and the
API is still somewhat backward-compatible.  All methods that once accepted parameters of type
`Text` are now overloaded to support `Text`, `String`, or `byte[]` as parameters.  The following
API changes from `HBaseAdmin` demonstrate this overloading.  In 0.1.3:

[[BR]] {{{public void addColumn(Text, HColumnDescriptor);}}}
[[BR]] {{{public void deleteColumn(Text, Text);}}}

In 0.2:

[[BR]] {{{public void addColumn(Text, HColumnDescriptor);}}}
[[BR]] {{{public void addColumn(String, HColumnDescriptor);}}}
[[BR]] {{{public void addColumn(byte[], HColumnDescriptor);}}}
[[BR]] {{{public void deleteColumn(Text, Text);}}}
[[BR]] {{{public void deleteColumn(String, String);}}}
[[BR]] {{{public void deleteColumn(byte[], byte[]);}}}
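
The overloading pattern can be sketched without HBase on the classpath. The class below is a hypothetical stand-in, not the real `HBaseAdmin`: it omits the `Text` overload (which would need Hadoop) and assumes, purely for illustration, that the `String` form is encoded as UTF-8 and delegated to the `byte[]` form.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; this is NOT the real HBaseAdmin.
class OverloadSketch {
    // Records "table/column" pairs that were requested for deletion.
    final List<String> deleted = new ArrayList<>();

    // byte[] overload: the form that does the actual work in this sketch.
    void deleteColumn(byte[] tableName, byte[] columnName) {
        deleted.add(new String(tableName, StandardCharsets.UTF_8)
                + "/" + new String(columnName, StandardCharsets.UTF_8));
    }

    // String overload: assumed here to encode as UTF-8 and delegate.
    void deleteColumn(String tableName, String columnName) {
        deleteColumn(tableName.getBytes(StandardCharsets.UTF_8),
                columnName.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        OverloadSketch admin = new OverloadSketch();
        // Both calls resolve to a different overload but end up in the same code path.
        admin.deleteColumn("mytable", "info:");
        admin.deleteColumn("mytable".getBytes(StandardCharsets.UTF_8),
                "data:".getBytes(StandardCharsets.UTF_8));
        System.out.println(admin.deleted);
    }
}
```

Callers that already hold `String` table names can therefore keep passing them, while code that works with raw bytes avoids a conversion.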

Aside from the method overloading described above, `HBaseAdmin` has not significantly changed.
 HBase 0.2 adds the following new methods to `HBaseAdmin`:

[[BR]] {{{public boolean isTableEnabled(Text);}}}
[[BR]] {{{public boolean isTableEnabled(String);}}}
[[BR]] {{{public boolean isTableEnabled(byte[]);}}}
[[BR]] {{{public void modifyTableMeta(byte[], HTableDescriptor);}}}

The only method from `HBaseAdmin` that no longer exists in HBase 0.2 is `checkReservedTableName(Text)`.
 These changes aside, the API and functional capability of `HBaseAdmin` have not drastically
changed between versions 0.1.3 and 0.2.

`HTable` has experienced far more significant changes to its API and functional capability.
 The biggest change is the way in which row updates are performed.  In HBase 0.1.3, `HTable`
had its own support for atomic row insertions and changes.  The following methods existed
in `HTable` and aided the updating process:

[[BR]] {{{public synchronized long startUpdate(Text);}}}
[[BR]] {{{public void put(long, Text, byte[]);}}}
[[BR]] {{{public void put(long, Text, Writable);}}}
[[BR]] {{{public void delete(long, Text);}}}
[[BR]] {{{public synchronized void checkUpdateInProgress();}}}
[[BR]] {{{public synchronized void abort(long);}}}
[[BR]] {{{public synchronized void commit(long, long);}}}
[[BR]] {{{public void commit(long);}}}

`startUpdate(Text)` was used to start an atomic row update on the passed row name, and a "lock
id" was returned to identify the update.  `put()` and `delete()` could be called using the
lock id, and any changes made could be aborted or committed.

HBase 0.2 has completely changed the way in which atomic row updates take place.  To update
a row, the user first creates a `BatchUpdate` object (new to 0.2, package `org.apache.hadoop.hbase.io`).
 Any put and delete operations are applied to the `BatchUpdate`, not the `HTable`.  Once finished,
the user commits the `BatchUpdate` to the `HTable` with one of the following new `HTable` methods:

[[BR]] {{{public synchronized void commit(BatchUpdate);}}}
[[BR]] {{{public synchronized void commit(List<BatchUpdate>);}}}
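
A minimal sketch of the batch style, using hypothetical stand-in classes (`SketchBatchUpdate`, `SketchTable`) rather than the real `BatchUpdate` and `HTable`: operations are collected on the batch object and reach the table only on commit. The method names, the `String` parameter types, and the in-memory map are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-ins; NOT the real HBase BatchUpdate or HTable classes.
class SketchBatchUpdate {
    final String row;
    // Each operation is {column, value}; a null value marks a delete.
    final List<String[]> ops = new ArrayList<>();

    SketchBatchUpdate(String row) { this.row = row; }

    void put(String column, String value) { ops.add(new String[] {column, value}); }

    void delete(String column) { ops.add(new String[] {column, null}); }
}

class SketchTable {
    // row -> (column -> value)
    final Map<String, Map<String, String>> rows = new TreeMap<>();

    // Applies every buffered operation for the row in one synchronized step.
    synchronized void commit(SketchBatchUpdate update) {
        Map<String, String> columns = rows.computeIfAbsent(update.row, r -> new TreeMap<>());
        for (String[] op : update.ops) {
            if (op[1] == null) {
                columns.remove(op[0]);
            } else {
                columns.put(op[0], op[1]);
            }
        }
    }

    public static void main(String[] args) {
        SketchTable table = new SketchTable();
        SketchBatchUpdate update = new SketchBatchUpdate("row1");
        update.put("info:name", "alice");
        update.put("info:tmp", "x");
        update.delete("info:tmp");
        // Nothing has touched the table until this call.
        table.commit(update);
        System.out.println(table.rows);
    }
}
```

Compared with the 0.1.3 lock-id workflow, the caller no longer tracks an identifier between calls; the batch object itself carries the row and its pending changes.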

Aside from the transition from `Text` toward `byte[]` and the addition of the `BatchUpdate`
structure, `HTable` has experienced a number of other API changes: 

 * In 0.1.3, package-private `getRegionLocation` took parameters of `(Text)` or `(Text, boolean)`.
 In 0.2, `getRegionLocation` has become public and only accepts parameters of either `(Text)`,
`(String)`, or `(byte[])`; the boolean "reload" option no longer exists.
 * In 0.1.3, the `HTable` constructor took parameters of `(HBaseConfiguration, Text)`.  In
0.2, the `HBaseConfiguration` parameter has become optional and as such there are now 6 different
constructors (with or without the `HBaseConfiguration` parameter, and with the `Text` parameter
varying as either `Text`, `String`, or `byte[]`).  
 * `public HTableDescriptor getMetadata()` has been replaced in 0.2 by `public HTableDescriptor
getTableDescriptor()`.  `getMetadata()` still exists in 0.2, although it is marked as deprecated.
 Additionally, both `getTableDescriptor()` and `getMetadata()` now return `UnmodifyableHTableDescriptor`
(new to 0.2, package `org.apache.hadoop.hbase.client`, a subclass of `HTableDescriptor`),
although their return type is still `HTableDescriptor`.  
 * In 0.1.3, all `get(...)` methods had a return type of `byte[]`.  In 0.2, these methods
now return a `Cell` (new to 0.2, package `org.apache.hadoop.hbase.io`).
 * In 0.1.3, all `getRow(...)` methods had a return type of `SortedMap<Text, byte[]>`.
 In 0.2, these methods now return a `RowResult` (new to 0.2, package `org.apache.hadoop.hbase.io`).
 * The methods `obtainScanner(...)` from 0.1.3 have been renamed to `getScanner(...)` in 0.2.
 Moreover, `getScanner(...)`  has a return type of `Scanner` (new to 0.2, package `org.apache.hadoop.hbase.client`),
whereas `obtainScanner(...)` had a return type of `HScannerInterface` (an interface that no longer
exists in 0.2).  
 * In 0.1.3, `deleteFamily` took parameters of either `(Text, Text, long)` or `(Text, Text)`.
 In 0.2, `deleteFamily` only accepts parameters of either `(Text, Text, long)`, `(String,
String, long)`, or `(byte[], byte[], long)`; the long (timestamp) must be included as a parameter.
 * The `HTable` nested class `ServerCallable<T>` from 0.1.3 has become a separate top-level
class in 0.2, located in package `org.apache.hadoop.hbase.client`.  
 * The method `getRegionServerWithRetries(ServerCallable)` from 0.1.3 has been moved to the
`HConnection` interface in 0.2.  
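
The `getTableDescriptor()` item in the list above describes a read-only subclass returned behind the base return type. The sketch below shows that general pattern with hypothetical stand-in classes (`SketchDescriptor`, `ReadOnlySketchDescriptor`); the `maxFileSize` property and the choice of `UnsupportedOperationException` are assumptions, not taken from the HBase source.

```java
// Hypothetical stand-ins; NOT the real HTableDescriptor classes.
class SketchDescriptor {
    private long maxFileSize;

    SketchDescriptor(long maxFileSize) { this.maxFileSize = maxFileSize; }

    long getMaxFileSize() { return maxFileSize; }

    void setMaxFileSize(long maxFileSize) { this.maxFileSize = maxFileSize; }
}

// Read-only view: getters are inherited, setters are rejected.
class ReadOnlySketchDescriptor extends SketchDescriptor {
    ReadOnlySketchDescriptor(SketchDescriptor source) { super(source.getMaxFileSize()); }

    @Override
    void setMaxFileSize(long maxFileSize) {
        throw new UnsupportedOperationException("descriptor is read-only");
    }
}

class DescriptorSketch {
    // Declared return type is the base class, but the instance is read-only.
    static SketchDescriptor getTableDescriptor() {
        return new ReadOnlySketchDescriptor(new SketchDescriptor(256L));
    }

    public static void main(String[] args) {
        SketchDescriptor d = getTableDescriptor();
        System.out.println(d.getMaxFileSize());
        try {
            d.setMaxFileSize(1L);
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The practical consequence for callers is that code compiled against the base type still compiles, but attempts to mutate the returned descriptor may fail at run time.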

Finally, the following methods have been added to HTable in HBase 0.2:

[[BR]] {{{public static boolean isTableEnabled(Text);}}}
[[BR]] {{{public static boolean isTableEnabled(String);}}}
[[BR]] {{{public static boolean isTableEnabled(byte[]);}}}
[[BR]] {{{public static boolean isTableEnabled(HBaseConfiguration, Text);}}}
[[BR]] {{{public static boolean isTableEnabled(HBaseConfiguration, String);}}}
[[BR]] {{{public static boolean isTableEnabled(HBaseConfiguration, byte[]);}}}

Once again, for greater detail on API changes to `HBaseAdmin`, `HTable`, and all classes new
to HBase 0.2, please refer to the <HBase 0.2 API>.  
