hbase-dev mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Limited cross row transactions
Date Thu, 19 Jan 2012 03:03:07 GMT
Was thinking about that as well. That would be doable.

There would still need to be some sort of distributed transaction (in the sense that there would be a
prepare/vote and commit phase between the participating regions), but it would all be local to a single server.
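
As a rough illustration of what such a server-local prepare/vote and commit phase could look like, here is a minimal sketch in Python. It is not HBase code: the `Region` class, its row locks, and `local_transaction` are hypothetical stand-ins, and durability (the WAL) and crash recovery are deliberately ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for one region hosted on a single server."""
    start: str                      # inclusive start key
    end: str                        # exclusive end key; "" means unbounded
    rows: dict = field(default_factory=dict)
    locked: set = field(default_factory=set)
    staged: dict = field(default_factory=dict)

    def owns(self, key):
        return key >= self.start and (self.end == "" or key < self.end)

    def prepare(self, txid, writes):
        """Vote yes only if none of the rows is locked by another transaction."""
        if any(key in self.locked for key in writes):
            return False
        self.locked.update(writes)
        self.staged[txid] = dict(writes)
        return True

    def commit(self, txid):
        """Apply the staged writes and release the row locks."""
        writes = self.staged.pop(txid)
        self.rows.update(writes)
        self.locked.difference_update(writes)

    def abort(self, txid):
        """Drop the staged writes and release the row locks."""
        writes = self.staged.pop(txid, {})
        self.locked.difference_update(writes)


def local_transaction(regions, txid, writes):
    """Apply all writes or none, across regions that live on the same server."""
    # Group the writes by the region that owns each row.
    per_region = {}
    for key, value in writes.items():
        owner = next((i for i, r in enumerate(regions) if r.owns(key)), None)
        if owner is None:
            return False            # row is not hosted on this server
        per_region.setdefault(owner, {})[key] = value

    # Prepare/vote phase: every participating region must vote yes.
    prepared = []
    for idx, region_writes in per_region.items():
        if regions[idx].prepare(txid, region_writes):
            prepared.append(idx)
        else:
            for p in prepared:
                regions[p].abort(txid)
            return False

    # Commit phase: all votes were yes, so apply everywhere.
    for idx in prepared:
        regions[idx].commit(txid)
    return True
```

Because every participant sits in the same process, the coordinator cannot be partitioned away from the regions, which is what would make this case more tractable than a cross-server two-phase commit.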



________________________________
 From: Ted Yu <yuzhihong@gmail.com>
To: dev@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com> 
Sent: Wednesday, January 18, 2012 6:51 PM
Subject: Re: Limited cross row transactions
 
Still need to go over the patch, Lars.

I wonder how difficult supporting cross-region transactions in the same
region server would be.

Cheers

On Wed, Jan 18, 2012 at 5:02 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> Filed https://issues.apache.org/jira/browse/HBASE-5229 for further
> discussion, attached a patch that does this.
>
>
> As for your point...
> The below is one way to define limited groups of rows that can participate
> in transactions (I should not have named it parent/child, that just
> confuses my point).
> Your scenario calls for global transactions (unless you have some other
> approach to limit the scope of rows that could participate in your FK
> transactions to something less than the entire database).
>
> If every transaction is a global transaction the database will not scale.
>
> See http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
> and
> http://www.cloudera.com/blog/2010/04/cap-confusion-problems-with-partition-tolerance/
>
> Also check out two phase commit failure and blocking scenarios, and Paxos'
> conditions for termination.
>
> -- Lars
>
>
> ----- Original Message -----
> From: Mikael Sitruk <mikael.sitruk@gmail.com>
> To: dev@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
> Cc:
> Sent: Wednesday, January 18, 2012 12:01 AM
> Subject: Re: Limited cross row transactions
>
> This is for a parent-child relationship, but what if there is no parent-child
> relationship, but rather a foreign-key-like relationship?
> Using this model you would do a full scan to get all the index rows (since you
> don't know the parent, you just know the "secondary index").
> Or would you use a group ID as a prefix of the parent key and the "child" key? In
> this case splitting according to group may be more difficult (due to
> different growth rates of the groups).
> Doing this, aren't we back to the headache of sharding in an RDBMS?
>
> Mikael.S
>
>
> On Wed, Jan 18, 2012 at 7:45 AM, lars hofhansl <lhofhansl@yahoo.com>
> wrote:
>
> > This thread is probably getting too long...
> >
> > In HBase we have to let go of "global stuff". I submit that global
> > transactions across 1000's of nodes that can fail will never work
> > adequately.
> > For that kind of consistency you will take a hit in availability.
> >
> > As in Megastore, the trick is in creating a local grouping of entities that
> > can participate in local transactions.
> > If you limit the (consistent) index to child entities of a parent entity,
> > you can form your index like this:
> > parentKey1...
> > parentKey1.childTableName1.indexedField1
> > parentKey1.childTableName1.indexedField2
> > ...
> > parentKey1.childTableName2.indexedField1
> > parentKey1.childTableName2.indexedField2
> > ...
> > (assuming . cannot be in any parent key or child table name here, but you
> > get the idea).
> >
> >
> > When scanning the parent you'd have to skip the index rows with a filter.
> > Within a parentKey you can find childKeys efficiently by scanning the
> > index rows.
> >
> > Since the parent and the index entries would sort together the table can
> > be pre-split (or one could have a simple prefix based balancer).
> >
> > -- Lars
> >
> > ----- Original Message -----
> > From: Mikael Sitruk <mikael.sitruk@gmail.com>
> > To: dev@hbase.apache.org
> > Cc:
> > Sent: Tuesday, January 17, 2012 3:07 PM
> > Subject: Re: Limited cross row transactions
> >
> > Well, I understand the limitation now; requiring rows to be in the same
> > region is a really hard constraint.
> > Even being on the same RS is not enough, because after a restart
> > regions may be allocated differently, and then part of the data may be in one
> > region under server A and the other part under server B.
> >
> > Perhaps we need use cases for better understanding, and perhaps for finding
> > alternatives.
> >
> > The first use case I was thinking of is as follows:
> > I need to insert data with different access criteria, but the data
> > should be inserted atomically.
> > In an RDBMS I would have two tables, insert data in the first one with key #1
> > and then in the second one with key #2, then commit.
> > In HBase I need to use different column families with key #1 (for atomicity),
> > then manage a kind of secondary index to map key #2 to key #1 (perhaps
> > via a coprocessor) to have quick access to the data of key #2.
> > With cross-row transactions, I would think of using different keys in the same
> > table (and probably different CFs too), without the need for a secondary
> > index, but again, with the limitation it does not seem to be easily
> > feasible.
> >
> > Mik.
> >
> > On Wed, Jan 18, 2012 at 12:22 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > People rely on RDBMS for the transaction support.
> > >
> > > Consider the following example:
> > > A highly de-normalized schema puts related users in the same region, where
> > > this 'limited cross row transactions' works.
> > > After some time, the region has to be split (maybe due to good business
> > > condition).
> > > What should the HBase user do now ?
> > >
> > > Cheers
> > >
> > > On Tue, Jan 17, 2012 at 2:13 PM, Mikael Sitruk <mikael.sitruk@gmail.com> wrote:
> > >
> > > > Ted - My 2 cents as a user.
> > > > The user should know what he is doing. This is like the 'delete' operation:
> > > > it is less intuitive than the original delete in an RDBMS, and the same
> > > > will be true for this light transaction.
> > > > If the transaction fails because it crosses region servers, then the design
> > > > of the user was wrong.
> > > > If the transaction fails because of concurrent access, then he should be
> > > > able to re-read and reprocess his request.
> > > > The only problem is how to make sure in advance that the different rows
> > > > will be in the same RS?
> > > >
> > > > Lars - is the limitation at the region or at the region server? It
> > > > was not so clear.
> > > >
> > > > Mikael.S
> > > >
> > > > On Tue, Jan 17, 2012 at 11:52 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Back to original proposal:
> > > > > If client side grouping reveals that the batch of operations cannot be
> > > > > supported by 'limited cross row transactions', what should the user do?
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Tue, Jan 17, 2012 at 1:49 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > > >
> > > > > > Whether Omid fits the bill is open to discussion.
> > > > > >
> > > > > > We should revisit HBASE-2315 and provide the support Flavio et al.
> > > > > > need.
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > >
> > > > > > On Tue, Jan 17, 2012 at 1:41 PM, Lars George <lars.george@gmail.com> wrote:
> > > > > >
> > > > > >> Hi Ted,
> > > > > >>
> > > > > >> Wouldn't Omid (https://github.com/yahoo/omid) help there? Or is that too
> > > > > >> broad? Just curious.
> > > > > >>
> > > > > >> Lars
> > > > > >>
> > > > > >> On Jan 17, 2012, at 4:36 PM, Ted Yu wrote:
> > > > > >>
> > > > > >> > Can we collect use cases for 'limited cross row transactions' first?
> > > > > >> >
> > > > > >> > I have been thinking about (unlimited) multi-row transaction support in
> > > > > >> > HBase. It may not be a one-man task. But we should definitely implement it
> > > > > >> > someday.
> > > > > >> >
> > > > > >> > Cheers
> > > > > >> >
> > > > > >> > On Tue, Jan 17, 2012 at 1:27 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
> > > > > >> >
> > > > > >> >> I just committed HBASE-5203 (together with HBASE-3584 this implements
> > > > > >> >> atomic row operations).
> > > > > >> >> Although a relatively small patch, it lays the groundwork for heterogeneous
> > > > > >> >> operations in a single WALEdit.
> > > > > >> >>
> > > > > >> >> The interesting part is that even though the code enforced the atomic
> > > > > >> >> operation to be for a single row, this is not required.
> > > > > >> >> It is enough if all involved KVs reside in the same region.
> > > > > >> >>
> > > > > >> >> I am not saying that we should add any high level concept to HBase (such
> > > > > >> >> as the EntityGroups of Megastore).
> > > > > >> >>
> > > > > >> >> But, with a slight addition to the API (allowing a grouping of multiple
> > > > > >> >> row operations) client applications have all the building blocks to do
> > > > > >> >> limited cross row atomic operations.
> > > > > >> >> The client application would be responsible for either correctly
> > > > > >> >> pre-splitting the table, or a custom balancer has to be provided.
> > > > > >> >>
> > > > > >> >> The operation would fail if the regionserver determines that it would need
> > > > > >> >> data from multiple region servers.
> > > > > >> >>
> > > > > >> >> I think this needs at least minimal support from HBase and cannot be done
> > > > > >> >> (efficiently or without adding more moving parts) by a client API only.
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> Comments? Is this worth pursuing? If so, I'll file a jira and provide a
> > > > > >> >> patch.
> > > > > >> >>
> > > > > >> >> Thanks.
> > > > > >> >>
> > > > > >> >>
> > > > > >> >> -- Lars
> > > > > >> >>
> > > > > >> >>
> > > > > >>
> > > > > >>
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Mikael.S
> > > >
> > >
> >
> >
> >
> > --
> > Mikael.S
> >
> >
>
>
> --
> Mikael.S
>
>
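
The parent/index row-key layout quoted above (`parentKey.childTableName.indexedField`) and the pre-splitting idea can be sketched roughly as follows. This is an illustration in Python, not HBase client code: all helper names are hypothetical, `.` is assumed never to appear in parent keys or child table names (as stated in the thread), and a region is modeled simply as the key range between two sorted split points.

```python
import bisect

SEP = "."  # assumed never to appear in parent keys or child table names


def parent_row(parent_key):
    """Row key of the parent entity itself."""
    return parent_key


def index_row(parent_key, child_table, indexed_value):
    """Row key of an index entry that sorts directly under its parent."""
    return SEP.join([parent_key, child_table, indexed_value])


def is_index_row(row_key):
    """Filter predicate: lets a scan over parents skip the index rows."""
    return SEP in row_key


def index_scan_range(parent_key, child_table):
    """[start, stop) range covering all index rows of one child table."""
    start = parent_key + SEP + child_table + SEP
    # Replace the trailing separator with the next character to get the stop key.
    stop = start[:-1] + chr(ord(SEP) + 1)
    return start, stop


def region_of(row_key, split_points):
    """Index of the region holding row_key; a split point starts a new region."""
    return bisect.bisect_right(split_points, row_key)


def same_region(row_keys, split_points):
    """True if every row key falls into the same region."""
    return len({region_of(key, split_points) for key in row_keys}) == 1
```

With split points chosen at parent keys (for example `["p2"]`), `p1` and all of its `p1.…` index rows land in one region, so a client could check `same_region` before attempting a grouped operation. The sketch assumes that no other parent key sorts between a parent and its index rows, which holds for fixed-width parent keys.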