hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: How are version conflicts handled in HBase?
Date Sun, 07 Jun 2015 00:58:46 GMT
We are also looking at increasing timestamp resolution in a backwards
compatible way (see https://issues.apache.org/jira/browse/HBASE-8927).

If we do introduce the concept of a "timestamp multiplier" per table, then
Bryan's suggestion could be an option without the regret. Clients could
learn the timestamp multiplier from schema (or we'd have an API for that),
take the current time, multiply or shift the bits accordingly, and fill in
the least significant bits with something unique, like bits from the local
MAC address maybe, and provide the result as the timestamp for the write.
TTLs would likely still be managed at second granularity at the CF level,
or millisecond if using cell TTLs, so those trailing unique bits would
effectively be ignored.
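
A minimal sketch of the shift-and-fill idea described above, using only the
Java standard library. The 8-bit width, the class name ShiftedTimestamp, and
the method names are hypothetical; HBASE-8927 does not define them, and no
HBase API for discovering a per-table multiplier is assumed here. The unique
bits are passed in by the caller (they could come from a MAC address, a
counter, or anything else that differs between writers).

```java
class ShiftedTimestamp {
    // Hypothetical: number of low-order bits reserved for uniqueness.
    static final int UNIQUE_BITS = 8;

    // Shift the millisecond time left and fill the freed low bits with
    // caller-supplied unique bits (masked to UNIQUE_BITS).
    static long compose(long epochMillis, long uniqueBits) {
        long mask = (1L << UNIQUE_BITS) - 1;
        return (epochMillis << UNIQUE_BITS) | (uniqueBits & mask);
    }

    // Recover the millisecond time by dropping the trailing unique bits,
    // which is what a TTL check at millisecond granularity would look at.
    static long toMillis(long composed) {
        return composed >>> UNIQUE_BITS;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long a = compose(now, 0x01);
        long b = compose(now, 0x02);
        // Same millisecond, different writers: distinct version timestamps.
        System.out.println(a != b);
        System.out.println(toMillis(a) == now && toMillis(b) == now);
    }
}
```

The composed value would be supplied as the explicit timestamp on the write.
Two writers in the same millisecond collide only if their unique bits also
match, so the width of that field is the trade-off to tune.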


On Sat, Jun 6, 2015 at 1:57 AM, Bryan Beaudreault <bbeaudreault@hubspot.com>
wrote:

> I wouldn't say it is recommended, but it is certainly possible to override
> the version timestamp at write time.  You might be able to use this to
> provide the uniqueness you need  (i.e. instead of using epoch timestamp,
> use one based more recently and add digits for uniqueness at the end).
>
> We've done this in the past and it worked perfectly fine, but I will say
> that eventually you usually want to take advantage of things like TTL, etc,
> and we ended up regretting the decision and migrating to a new table.
>
> Alternatively, just do uniqueness with the column qualifiers.  Your
> qualifier could be realQfBytes:guid, or something.  On the read side you
> can easily combine these together with a prefix filter.
>
> On Fri, Jun 5, 2015 at 11:55 AM Ted Yu <yuzhihong@gmail.com> wrote:
>
> > Dia:
> > Have you tried command in the following form in hbase shell ?
> >
> >   hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
> >
> > Cheers
> >
> > On Fri, Jun 5, 2015 at 8:50 AM, Dia Kharrat <dkharrat@gmail.com> wrote:
> >
> > > Ted: Yes, I've already read the HBase documentation, but didn't find
> > > anything that directly answered my question. My question was simply
> > > whether it's possible to get cells with the same timestamp with
> > > concurrent Puts to the same cell.
> > >
> > > Vlad: Thanks for the information. Yep, I've also noticed the
> > > last-write-wins behavior when testing directly in hbase shell and
> > > writing out two values to the same cell with an identical timestamp.
> > >
> > > So, I gather that an application writing to the same cell concurrently
> > > will result in potential data loss when the write is at the exact same
> > > millisecond, correct? For my use-case, I'm not necessarily looking for
> > > uniqueness for the timestamp, but want to ensure I can still access all
> > > versions of the cell, even ones with the same timestamp.
> > >
> > > Dia
> > >
> > > On Thu, Jun 4, 2015 at 7:05 PM, Vladimir Rodionov <vladrodionov@gmail.com>
> > > wrote:
> > >
> > > > >> Please read http://hbase.apache.org/book.html#_store
> > > >
> > > > How does this answer original question?
> > > >
> > > > -Vlad
> > > >
> > > > On Thu, Jun 4, 2015 at 6:30 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Dia:
> > > > > Please read http://hbase.apache.org/book.html#_store
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Thu, Jun 4, 2015 at 6:02 PM, Vladimir Rodionov <vladrodionov@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Yes, last write wins (with higher sequenceId). MemStore will
> > > > > > resolve this conflict and only the last put will be added
> > > > > > eventually, unless ... between these two puts MemStore's snapshot
> > > > > > is created. In this case put #1 will be saved in a snapshot and
> > > > > > eventually will make it into a store file, but this is just my
> > > > > > speculation.
> > > > > >
> > > > > > -Vlad
> > > > > >
> > > > > > On Thu, Jun 4, 2015 at 5:08 PM, Dia Kharrat <dkharrat@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I'm trying to confirm the behavior of HBase when there are
> > > > > > > concurrent writes to the same cell that happen at the exact same
> > > > > > > millisecond and not providing a timestamp value to the Put
> > > > > > > operations (i.e. relying on current time of region server). Is it
> > > > > > > possible that such concurrent writes result in a cell with an
> > > > > > > identical version value or does HBase have a mechanism to protect
> > > > > > > against that?
> > > > > > >
> > > > > > > If that's the case, my understanding is that last write wins,
> > > > > > > correct?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Dia
> > > > > > >



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
