hbase-dev mailing list archives

From Sergey Shelukhin <ser...@hortonworks.com>
Subject Re: Purpose of versions in HBase...
Date Mon, 09 Dec 2013 17:56:11 GMT
I suspect the honest answer would be "because BigTable paper had it" :P

There are several aspects to cell versioning (I may be missing some).
First (not the most important), due to the way HBase stores things
(write-once files), it comes very cheaply - very little runtime cost, and
not so much code needs to be written to have it.
Second, internally, versioning allows snapshot isolation (within a
server) to work - with multiple versions present, scanners can read
older ones to get a consistent view (that's MVCC).
Third, user-visible, timestamp-based cell versioning is there so that users
can control the order of things (e.g. delete all cells before...), either
through fabricated timestamps, or using external timestamps, e.g. from
external logs. In fact, with the current HBase implementation of auto-ts
(there are JIRAs to fix it), that's the only "bulletproof" way to use HBase;
internal HBase versioning relies on server clocks, which is fraught with
peril (granted, most systems will rarely hit these problems, and may be ok
with some reordering anyway).
Fourth, multiple versions as such can be used for some application-specific
scenarios; the Percolator paper is a good example.
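
To make the second point (MVCC) a bit more concrete, here is a toy sketch in
Python. It is not HBase code: ToyStore, its methods, and the single integer
"read point" are all made up for illustration, and real region servers are
considerably more involved. The only idea it shows is that keeping old
versions around lets a scanner keep reading the state as of the moment it
was opened.

```python
class ToyStore:
    """Toy multi-version store (hypothetical, not the HBase implementation)."""

    def __init__(self):
        self.cells = {}     # key -> append-only list of (write_seq, value)
        self.committed = 0  # highest write sequence visible to new scanners

    def put(self, key, value):
        # Every write gets the next sequence number and never overwrites
        # an earlier version in place.
        self.committed += 1
        self.cells.setdefault(key, []).append((self.committed, value))

    def open_scanner(self):
        # A scanner remembers the read point at the moment it was opened.
        return self.committed

    def get(self, key, read_point):
        # Return the newest version written at or before the read point.
        visible = [v for seq, v in self.cells.get(key, []) if seq <= read_point]
        return visible[-1] if visible else None


store = ToyStore()
store.put("row1", "a")
scanner = store.open_scanner()   # read point taken here
store.put("row1", "b")           # write that lands after the scanner opened

old_view = store.get("row1", scanner)               # still sees "a"
new_view = store.get("row1", store.open_scanner())  # a new scanner sees "b"
```

Without the older version of "row1" still being present, the first scanner
would have no way to return a view consistent with its read point.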

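The third point (user-controlled ordering) can be sketched the same way.
Again this is a toy model, not the HBase client API: ToyCell, put,
delete_up_to and get are hypothetical names, and the rule that a delete
marker hides every version at or below its timestamp is a simplification
assumed here. What it shows is that when the client supplies timestamps,
the result does not depend on arrival order or on any server clock.

```python
class ToyCell:
    """Toy timestamp-versioned cell (hypothetical, for illustration only)."""

    def __init__(self):
        self.versions = {}       # client-supplied timestamp -> value
        self.delete_upto = None  # marker hiding versions with ts <= this

    def put(self, ts, value):
        self.versions[ts] = value

    def delete_up_to(self, ts):
        # Keep only the highest delete marker seen so far.
        if self.delete_upto is None or ts > self.delete_upto:
            self.delete_upto = ts

    def get(self):
        # The newest version not hidden by the delete marker wins.
        live = [ts for ts in self.versions
                if self.delete_upto is None or ts > self.delete_upto]
        return self.versions[max(live)] if live else None


cell = ToyCell()
cell.put(200, "new")   # arrives first, but carries the later timestamp
cell.put(100, "old")   # arrives second, with the earlier timestamp
latest = cell.get()    # ordering follows timestamps, not arrival order

cell.delete_up_to(150)       # "delete all cells before..." style marker
after_delete = cell.get()    # ts=100 is hidden, ts=200 survives

cell.delete_up_to(200)
after_full_delete = cell.get()   # nothing left to return
```
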
On Sun, Dec 8, 2013 at 9:35 AM, Michael Segel <msegel_hadoop@hotmail.com> wrote:

> Hi,
> In a different thread, we were discussing good and better schema designs.
> In order to really understand why one should or should not do something,
> it's kind of important to understand the underlying reasons why HBase was
> designed the way it was.
> So since we have a bunch of committers here, and cc'ing the Dev list,
> I'd like to explore why HBase has cell versioning: what its purpose is,
> how it is implemented, and why.
> This may seem a bit esoteric, but it would go a long way in educating many
> of the users on the hbase mailing list.
> Also it may be a good couple of paragraphs to add to the online
> reference...
> -Mike
