hbase-user mailing list archives

From Bryan Beaudreault <bbeaudrea...@hubspot.com>
Subject Re: How are version conflicts handled in HBase?
Date Fri, 05 Jun 2015 15:57:49 GMT
I wouldn't say it is recommended, but it is certainly possible to override
the version timestamp at write time. You might be able to use this to
provide the uniqueness you need (i.e. instead of using the epoch timestamp,
use one counted from a more recent base and add digits for uniqueness at
the end).

We've done this in the past and it worked perfectly fine, but I will say
that eventually you usually want to take advantage of things like TTL, etc.,
and we ended up regretting the decision and migrating to a new table.
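
To make the timestamp idea concrete, here is a rough sketch using only the
JDK. The class name, the custom epoch and the number of slots per millisecond
are all assumptions made up for illustration, not anything HBase provides.
The resulting long would be passed as the explicit timestamp of the Put.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of HBase: builds version timestamps that stay
// distinct even when several writes land in the same millisecond.
final class UniqueTimestamps {
    // Assumed custom epoch (2015-01-01T00:00:00Z in epoch millis).
    static final long CUSTOM_EPOCH_MILLIS = 1420070400000L;
    // Assumed number of slots reserved per millisecond for uniqueness digits.
    static final long SLOTS_PER_MILLI = 1000L;

    // Per-process counter; only guarantees uniqueness within one JVM.
    private static final AtomicLong COUNTER = new AtomicLong();

    // Millis since the custom epoch, shifted left by three decimal digits,
    // with the sequence number filling the freed digits.
    static long compose(long epochMillis, long sequence) {
        long sinceCustomEpoch = epochMillis - CUSTOM_EPOCH_MILLIS;
        return sinceCustomEpoch * SLOTS_PER_MILLI
                + Math.floorMod(sequence, SLOTS_PER_MILLI);
    }

    // Next version for "now"; wraps after SLOTS_PER_MILLI calls, so more than
    // that many writes in one millisecond could still collide.
    static long next() {
        return compose(System.currentTimeMillis(), COUNTER.getAndIncrement());
    }

    // Recovers the wall-clock millisecond a version was derived from.
    static long toEpochMillis(long version) {
        return version / SLOTS_PER_MILLI + CUSTOM_EPOCH_MILLIS;
    }
}
```

Note that these values are no longer wall-clock milliseconds, which is the
reason time-based features such as TTL stop behaving the way you would expect.
Multiple writer processes would also need some way to split the digit range
between them; the sketch does not attempt that.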

Alternatively, just get the uniqueness from the column qualifiers. Your
qualifier could be realQfBytes:guid, or something similar. On the read side
you can easily combine these together with a column prefix filter.
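
A rough sketch of the qualifier approach, again JDK-only. The class name, the
':' separator and the use of java.util.UUID for the guid are assumptions for
illustration. On a real cluster the prefix check would most likely be done
server-side by something like ColumnPrefixFilter rather than by hasPrefix.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical helper, not part of HBase: makes every write land in its own
// column by appending a GUID to the real qualifier.
final class UniqueQualifiers {
    // Assumed separator between the real qualifier and the GUID.
    static final byte SEPARATOR = (byte) ':';

    // Returns realQualifier + ':' + guid as bytes.
    static byte[] withGuid(byte[] realQualifier, UUID guid) {
        byte[] suffix = guid.toString().getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[realQualifier.length + 1 + suffix.length];
        System.arraycopy(realQualifier, 0, out, 0, realQualifier.length);
        out[realQualifier.length] = SEPARATOR;
        System.arraycopy(suffix, 0, out, realQualifier.length + 1, suffix.length);
        return out;
    }

    // Returns realQualifier + ':' for use as the read-side prefix. Including
    // the separator keeps "clicks" from matching "clicksTotal:...".
    static byte[] prefixFor(byte[] realQualifier) {
        byte[] out = new byte[realQualifier.length + 1];
        System.arraycopy(realQualifier, 0, out, 0, realQualifier.length);
        out[realQualifier.length] = SEPARATOR;
        return out;
    }

    // Client-side stand-in for a column prefix match.
    static boolean hasPrefix(byte[] qualifier, byte[] prefix) {
        if (qualifier.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (qualifier[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Each write then goes to withGuid(realQf, UUID.randomUUID()), and a read
collects every column whose qualifier starts with prefixFor(realQf). This
assumes the real qualifier itself never contains the separator byte.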

On Fri, Jun 5, 2015 at 11:55 AM Ted Yu <yuzhihong@gmail.com> wrote:

> Dia:
> Have you tried command in the following form in hbase shell ?
>
>   hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
>
> Cheers
>
> On Fri, Jun 5, 2015 at 8:50 AM, Dia Kharrat <dkharrat@gmail.com> wrote:
>
> > Ted: Yes, I've already read the HBase documentation, but didn't find
> > anything that directly answered my question. My question was simply
> > whether it's possible to get cells with the same timestamp with
> > concurrent Puts to the same cell.
> >
> > Vlad: Thanks for the information. Yep, I've also noticed the behavior of
> > the last-write-wins when testing directly in hbase shell and writing out
> > two values to the same cell with an identical timestamp.
> >
> > So, I gather that an application writing to the same cell concurrently
> > will result in potential data loss when the write is at the exact same
> > millisecond, correct? For my use-case, I'm not necessarily looking for
> > uniqueness for the timestamp, but want to ensure I can still access all
> > versions of the cell, even ones with the same timestamp.
> >
> > Dia
> >
> > On Thu, Jun 4, 2015 at 7:05 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> >
> > > >> Please read http://hbase.apache.org/book.html#_store
> > >
> > > How does this answer original question?
> > >
> > > -Vlad
> > >
> > > On Thu, Jun 4, 2015 at 6:30 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > >
> > > > Dia:
> > > > Please read http://hbase.apache.org/book.html#_store
> > > >
> > > > Cheers
> > > >
> > > > On Thu, Jun 4, 2015 at 6:02 PM, Vladimir Rodionov <vladrodionov@gmail.com> wrote:
> > > >
> > > > > Yes, last write wins (with higher sequenceId). MemStore will resolve
> > > > > this conflict and only the last put will be added eventually,
> > > > > unless ... between these two puts MemStore's snapshot is created.
> > > > > In this case put #1 will be saved in a snapshot and eventually will
> > > > > make it into a store file, but this is just my speculation.
> > > > >
> > > > > -Vlad
> > > > >
> > > > > On Thu, Jun 4, 2015 at 5:08 PM, Dia Kharrat <dkharrat@gmail.com> wrote:
> > > > >
> > > > > > I'm trying to confirm the behavior of HBase when there are
> > > > > > concurrent writes to the same cell that happen at the exact same
> > > > > > millisecond and not providing a timestamp value to the Put
> > > > > > operations (i.e. relying on current time of region server). Is it
> > > > > > possible that such concurrent writes result in a cell with an
> > > > > > identical version value or does HBase have a mechanism to protect
> > > > > > against that?
> > > > > >
> > > > > > If that's the case, my understanding is that last write wins,
> > > > > > correct?
> > > > > >
> > > > > > Thanks,
> > > > > > Dia
> > > > > >
> > > > >
> > > >
> > >
> >
>
