hbase-dev mailing list archives

From Sean Busbey <bus...@cloudera.com>
Subject Re: Nondeterministic outcome based on cell TTL and major compaction event order
Date Fri, 17 Apr 2015 23:52:17 GMT
If you have max versions set to 1 (the default), then c1 should be removed
at compaction time, provided c2 has not yet expired when the compaction runs.
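
The ordering dependence can be reproduced with a toy model. The sketch below is not HBase code: the `Cell` class, `major_compact`, `read`, the 30s/61s clock values, and the "old"/"new" values are all made up for illustration. It only assumes max versions = 1 and that expired cells are dropped without leaving a tombstone.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    ts: int                             # version timestamp (1 for c1, 2 for c2)
    value: str
    expires_at: Optional[float] = None  # seconds on a toy clock; None = no TTL

    def expired(self, now: float) -> bool:
        return self.expires_at is not None and now >= self.expires_at


def major_compact(cells: List[Cell], now: float, max_versions: int = 1) -> List[Cell]:
    """Drop expired cells, then keep only the newest max_versions cells."""
    live = sorted((c for c in cells if not c.expired(now)),
                  key=lambda c: c.ts, reverse=True)
    return live[:max_versions]


def read(cells: List[Cell], now: float) -> Optional[str]:
    """Return the newest non-expired value, as a normal scanner would see it."""
    live = [c for c in cells if not c.expired(now)]
    return max(live, key=lambda c: c.ts).value if live else None


def fresh_store() -> List[Cell]:
    c1 = Cell(ts=1, value="old")                   # no TTL
    c2 = Cell(ts=2, value="new", expires_at=60.0)  # TTL = 1 minute
    return [c1, c2]


def compact_then_expire() -> Optional[str]:
    # Compaction at t=30s: c2 is still live, so c1 is dropped as an excess version.
    store = major_compact(fresh_store(), now=30.0)
    return read(store, now=61.0)


def expire_then_compact() -> Optional[str]:
    # Compaction at t=61s: c2 has already expired, so c1 is the newest live cell.
    store = major_compact(fresh_store(), now=61.0)
    return read(store, now=62.0)


print(compact_then_expire())  # None
print(expire_then_compact())  # old
```

The same two writes end with no data or with the old value, depending only on when the compaction happens to run.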

-- 
Sean
On Apr 17, 2015 6:41 PM, "Michael Segel" <michael_segel@hotmail.com> wrote:

> Ok,
> So if you have a previous cell (c1) and you insert a new cell (c2) that
> has a TTL of, let’s say, 5 minutes, then c1 should always exist?
> That is my understanding, but from Cosmin’s post, he’s saying it’s
> different. And that’s why I don’t understand. You couldn’t lose the cell
> c1 at all, compaction or no compaction.
>
> That’s why I’m confused.  Current behavior doesn’t match the expected
> contract.
>
> -Mike
>
> > On Apr 17, 2015, at 4:37 PM, Andrew Purtell <apurtell@apache.org> wrote:
> >
> > The way TTLs work today is that they define the interval of time during
> > which a cell exists, and nothing more. No tombstone is laid as with a normal
> > delete. Once the TTL elapses the cell just ceases to exist to normal
> > scanners. The interaction of expired cells, multiple versions, minimum
> > versions, raw scanners, etc. can be confusing. We can absolutely
> > revisit this.
> >
> > A cell with an expired TTL could be treated as the combination of
> > tombstone and the most recent value it lays over. This is not how the
> > implementation works today, but could be changed for an upcoming major
> > version like 2.0 if there's consensus to do it.
> >
> >
> >> On Apr 10, 2015, at 7:26 AM, Cosmin Lehene <clehene@adobe.com> wrote:
> >>
> >> I was initially puzzled by this, although I realize it is likely as
> >> designed.
> >>
> >>
> >> Cell TTL expiration and compaction events can leave either some (the
> >> older) data or no data at all for a particular (row, family, qualifier)
> >> coordinate.
> >>
> >>
> >>
> >> Write (r1, f1, q1, v1, 1)
> >>
> >> Write (r1, f1, q1, v1, 2) - TTL=1 minute
> >>
> >>
> >> Scenario 1:
> >>
> >>
> >> If a major compaction happens within a minute
> >>
> >>
> >> it will remove (r1, f1, q1, v1, 1)
> >>
> >> then after a minute (r1, f1, q1, v1, 2) will expire
> >>
> >> no data left
> >>
> >>
> >> Scenario 2:
> >>
> >>
> >> A minute passes
> >>
> >> (r1, f1, q1, v1, 2) expires
> >>
> >> Compaction runs..
> >>
> >> (r1, f1, q1, v1, 1) remains
> >>
> >>
> >>
> >> This seems, by and large, to be expected behavior, but it is still
> >> "uncomfortable" that the overall outcome is decided not by me but by the
> >> chance ordering of events.
> >>
> >>
> >> I wonder whether we'd want this to behave differently (perhaps it has
> >> been discussed already); if not, it's worth documenting in more detail
> >> in the book.
> >>
> >>
> >> What do you think?
> >>
> >>
> >> Cosmin
> >>
> >>
> >>
> >>
> >
> > --
> > Best regards,
> >
> >   - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet
> > Hein (via Tom White)
> >
>
> The opinions expressed here are mine, while they may reflect a cognitive
> thought, that is purely accidental.
> Use at your own risk.
> Michael Segel
> michael_segel (AT) hotmail.com
>
>
>
>
>
>
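
The alternative Andrew raises in the quoted message (treating an expired cell as a tombstone over the value beneath it) can be compared with today's behavior in a few lines. This is only one possible reading of that proposal, not an HBase implementation; `read_current`, `read_masking`, and the tuple layout are hypothetical names chosen for the sketch, and it assumes compaction would retain the expired cell as a marker.

```python
from typing import List, Optional, Tuple

# Toy cell: (ts, value, expires_at); expires_at is None when no TTL is set.
Cell = Tuple[int, str, Optional[float]]


def is_expired(cell: Cell, now: float) -> bool:
    return cell[2] is not None and now >= cell[2]


def read_current(cells: List[Cell], now: float) -> Optional[str]:
    """Behavior described in the thread: expired cells are simply invisible."""
    live = [c for c in cells if not is_expired(c, now)]
    return max(live, key=lambda c: c[0])[1] if live else None


def read_masking(cells: List[Cell], now: float) -> Optional[str]:
    """Assumed reading of the proposal: an expired newest cell hides older ones."""
    if not cells:
        return None
    newest = max(cells, key=lambda c: c[0])
    return None if is_expired(newest, now) else newest[1]


cells = [(1, "old", None), (2, "new", 60.0)]  # c1 without TTL, c2 with TTL = 1 minute

print(read_current(cells, now=61.0))  # old
print(read_masking(cells, now=61.0))  # None
```

Under the masking rule the read after expiry returns nothing whether or not c1 was compacted away earlier, so the result no longer depends on event order.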
