hbase-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9245) Remove dead or deprecated code from hbase 0.96
Date Mon, 09 Sep 2013 18:40:52 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13762140#comment-13762140 ]

Jonathan Hsieh commented on HBASE-9245:

Here's motivation for 0.96 vs 0.98:
1) There would be perf degradation due to api shimming at multiple locations that actually
affects the common case (see HBASE-9359, and some of the grossness in the filter api shim).
2) Many other apis have already been removed, which breaks existing apps, so it
seemed better to get them all at this time instead of guaranteeing that it will happen again.
3) I believe the changes are fairly minimal (a type change), and efforts are being made to
minimize the impact once this type change is done.
4) Not doing it now blocks a whole class of optimizations from coming in as minor feature
additions until we have another chance to break the api.

Here's motivation for the overall removal of KV from the client/common api:

In 0.94, KV is a concrete implementation that is present on the serverside and the client
side. The internal structure and layout is exposed such that all kvs are locked into being
a single contiguous array/base pointer (for the entire KV) with offsets and lengths into this
base pointer for each of the major fields (row, fam, qual, value, ts).  This means each kv
has a full copy of all these fields.
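
As a rough illustration of the single-base-pointer idea, here is a minimal Java sketch. The
class name FlatKv, its field names, and the byte layout (row | fam | qual | ts | value) are
hypothetical and simplified; this is not the actual KeyValue format. It only shows that every
field is an (offset, length) pair into one array, so two kvs in the same row each carry
their own copy of the row bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/**
 * Simplified, hypothetical flat layout: row | fam | qual | ts (8 bytes) | value,
 * all in one array. This is NOT the real KeyValue byte format.
 */
final class FlatKv {
    final byte[] buf;                 // the single base pointer for the whole KV
    final int rowOff, rowLen;
    final int famOff, famLen;
    final int qualOff, qualLen;
    final int tsOff;                  // timestamp occupies 8 bytes at this offset
    final int valOff, valLen;

    FlatKv(String row, String fam, String qual, long ts, String val) {
        byte[] r = utf8(row), f = utf8(fam), q = utf8(qual), v = utf8(val);
        buf = new byte[r.length + f.length + q.length + Long.BYTES + v.length];

        // Each field is described only by an offset and a length into buf.
        rowOff = 0;                  rowLen = r.length;
        famOff = rowOff + rowLen;    famLen = f.length;
        qualOff = famOff + famLen;   qualLen = q.length;
        tsOff = qualOff + qualLen;
        valOff = tsOff + Long.BYTES; valLen = v.length;

        // Every field is copied into this kv's own array.
        System.arraycopy(r, 0, buf, rowOff, rowLen);
        System.arraycopy(f, 0, buf, famOff, famLen);
        System.arraycopy(q, 0, buf, qualOff, qualLen);
        ByteBuffer.wrap(buf).putLong(tsOff, ts);
        System.arraycopy(v, 0, buf, valOff, valLen);
    }

    static byte[] utf8(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    String row()     { return new String(buf, rowOff, rowLen, StandardCharsets.UTF_8); }
    String value()   { return new String(buf, valOff, valLen, StandardCharsets.UTF_8); }
    long timestamp() { return ByteBuffer.wrap(buf).getLong(tsOff); }
}
```

Two FlatKv instances built for the same row have distinct buf arrays, each holding the row
bytes again; that duplication is the cost described above.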

In 0.96 we've introduced encodings that break the single base pointer assumption.  The Cell
interface exposes what is essentially multiple base pointers (one for row, fam, qual, val,
just returning long for ts).  This will allow us to use more efficient encodings that
let multiple KVs share rows/fams/quals within a single array.
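
A minimal Java sketch of the multiple-base-pointer idea follows. SketchCell, SharedRowCell
and SketchCells are hypothetical names made up for this illustration, and the interface is
pared down to row, value and timestamp; the real Cell interface also covers family and
qualifier. The point is only that each field comes with its own (array, offset, length)
triple, so several cells can point into one shared row array while a caller reads them all
the same way.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical, pared-down stand-in for a Cell-style interface. */
interface SketchCell {
    byte[] rowArray();   int rowOffset();   int rowLength();     // base pointer #1
    byte[] valueArray(); int valueOffset(); int valueLength();   // base pointer #2
    long timestamp();                                            // returned directly
}

/** A cell whose row bytes live in an array that other cells may also reference. */
final class SharedRowCell implements SketchCell {
    private final byte[] rows;   // shared, not copied
    private final int rowOff, rowLen;
    private final byte[] val;
    private final long ts;

    SharedRowCell(byte[] rows, int rowOff, int rowLen, byte[] val, long ts) {
        this.rows = rows; this.rowOff = rowOff; this.rowLen = rowLen;
        this.val = val; this.ts = ts;
    }

    public byte[] rowArray()   { return rows; }
    public int rowOffset()     { return rowOff; }
    public int rowLength()     { return rowLen; }
    public byte[] valueArray() { return val; }
    public int valueOffset()   { return 0; }
    public int valueLength()   { return val.length; }
    public long timestamp()    { return ts; }
}

final class SketchCells {
    /** Reads the row through the interface only; works for any implementation. */
    static String rowOf(SketchCell c) {
        return new String(c.rowArray(), c.rowOffset(), c.rowLength(), StandardCharsets.UTF_8);
    }
}
```

A flat, KeyValue-style implementation could satisfy the same interface by returning its one
backing array from both rowArray() and valueArray() with different offsets, so code written
against the interface would not need to know which layout it is reading.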

Currently in 0.96 there are two implementations of Cell -- KeyValue (a cell backed by a flat
contiguous array), and PrefixTreeCell (a cell that uses multiple base pointers to share
prefixes).  Currently the PrefixTreeCell is only on the RS side (actually only at the store
file I believe) and we have to do a bunch of interpreting on the RS side to convert to KV's,
and then ship to clients.  

By changing the client/common API to only use the Cell interface, we decouple the interface
from the implementation.  This opens opportunities to push KV encodings up from the HFile
level into the scanners, and gives us the flexibility to send encoded kvs to the client.

It is important to do the client api first since these will be the longest living, and other
changes from here on will likely be internal to RS's or only additions to the rpc protocol
which should not break compatibility as future 0.96 api+wire compatible hbases come around.

> Remove dead or deprecated code from hbase 0.96
> ----------------------------------------------
>                 Key: HBASE-9245
>                 URL: https://issues.apache.org/jira/browse/HBASE-9245
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jonathan Hsieh
> This is an umbrella issue that will cover the removal or refactoring of dangling dead
> code and cruft.  Some can make it into 0.96, some may have to wait for an 0.98.  The "great
> culling" of code will be grouped patches that are logically related.

