hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: Purpose of versions in HBase...
Date Tue, 10 Dec 2013 07:54:41 GMT
On Mon, Dec 9, 2013 at 3:17 PM, Michael Segel <michael_segel@hotmail.com> wrote:

> I believe there's a bit more to it...
>
>
Such as?



> Which is why I am asking.
>


> As to #3... What happens to a column when you put a tombstone marker on it?
>
>
We have this in the doc.  If it does not answer your question, let's fix it.

Thanks Michael.
St.Ack



> On Dec 9, 2013, at 11:56 AM, Sergey Shelukhin <sergey@hortonworks.com>
> wrote:
>
> > I suspect the honest answer would be "because BigTable paper had it" :P
> >
> > There are several aspects to cell versioning (I may be missing some).
> > First (not the most important), due to the way HBase stores things
> > (write-once files), it comes very cheaply - very little runtime cost, and
> > not so much code needs to be written to have it.
> > Second, internally, versioning allows for snapshot isolation (within a
> > server) to work - with multiple versions present, scanners can read
> > older ones to get a consistent view (that's MVCC).
> > Third, user-visible, timestamp-based cell versioning is there so that users
> > could control the order of things (e.g. delete all cells before...), either
> > thru fabricated timestamps, or using external timestamps, e.g. from
> > external logs. In fact, with the current HBase implementation of auto-ts
> > (there are JIRAs to fix it), that's the only "bulletproof" way to use HBase;
> > internal HBase versioning relies on server clocks, which is fraught with
> > peril (granted, most systems will rarely hit these problems, and may be ok
> > with some reordering anyway).
> > Fourth, multi-versions as such could be used for some application-specific
> > scenarios; the Percolator paper is a good example.
> >
> >
> >
> > On Sun, Dec 8, 2013 at 9:35 AM, Michael Segel <msegel_hadoop@hotmail.com> wrote:
> >
> >>
> >> Hi,
> >>
> >> In a different thread, we were discussing good and better schema designs.
> >> In order to really understand why one should or should not do something,
> >> it's kind of important to understand the underlying reasons why HBase was
> >> designed the way it was.
> >>
> >> So since we have a bunch of committers here, and cc'ing the Dev list,
> >>
> >> I'd like to explore why HBase has cell versioning. What's its
> >> purpose? How is it implemented, and why?
> >>
> >> This may seem a bit esoteric, but it would go a long way in educating many
> >> of the users on the hbase mailing list.
> >>
> >> Also it may be a good couple of paragraphs to add to the online
> >> reference...
> >>
> >> -Mike
> >>
> >>
> >
>
> The opinions expressed here are mine, while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com
>
>
>
>
>
>
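
To make the third point in Sergey's reply (and the tombstone question) a bit more concrete, here is a minimal Python sketch of timestamp-ordered cell versions with a delete marker. It is a toy in-memory model, not the HBase client API: the class `ToyColumn` and its methods are hypothetical names made up for illustration. The masking rule it uses (a delete marker hides every version whose timestamp is at or below the marker's, including puts that arrive afterwards) is a simplified assumption about how tombstones behave before they are compacted away.

```python
class ToyColumn:
    """Toy model of one column's versions. Hypothetical; not the HBase API."""

    def __init__(self):
        self.versions = {}        # timestamp -> value
        self.tombstone_ts = None  # newest delete marker seen so far, if any

    def put(self, ts, value):
        # The caller supplies the timestamp, so ordering does not depend
        # on the arrival order of writes or on any server clock.
        self.versions[ts] = value

    def delete_up_to(self, ts):
        # Record a tombstone covering all versions with timestamp <= ts.
        if self.tombstone_ts is None or ts > self.tombstone_ts:
            self.tombstone_ts = ts

    def get(self, max_versions=1):
        # Return the newest visible versions, newest first.
        visible = [
            ts for ts in self.versions
            if self.tombstone_ts is None or ts > self.tombstone_ts
        ]
        visible.sort(reverse=True)
        return [(ts, self.versions[ts]) for ts in visible[:max_versions]]


col = ToyColumn()
col.put(200, "b")        # arrives first, but carries the later timestamp
col.put(100, "a")        # arrives second, with an earlier timestamp
newest = col.get()       # the timestamp decides, not the arrival order

col.delete_up_to(150)    # tombstone masks everything at or below 150
col.put(120, "late")     # a later put with an older timestamp stays masked
after_delete = col.get(max_versions=3)

print(newest, after_delete)
```

In this sketch the reader always sees the cell with the highest timestamp, which is why client-supplied timestamps give a deterministic order even when writes are reordered in flight.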
