hbase-user mailing list archives

From Guilherme Germoglio <germog...@gmail.com>
Subject Re: Problems when executing many (?) HTable.lockRow()
Date Fri, 15 May 2009 00:31:18 GMT
On Thu, May 14, 2009 at 3:40 PM, stack <stack@duboce.net> wrote:

> No consideration has been made for changes in how locks are done in new
> 0.20.0 API.  Want to propose something Guilherme?  Could new zk-arbitrated
> locking be done inside the confines of the RowLock object?


I think so.

If nothing is to be changed in the RowLock class, we could use the following
approach:

Considering the line as is today:

*RowLock lock = htable.lockRow(row);*

Instead of contacting the regionserver and requesting a lock for the given
row, *htable.lockRow(row)* would contact zk for a lock, just as in the lock
recipe [1]
(http://hadoop.apache.org/zookeeper/docs/current/recipes.html#sc_recipes_Locks).
Notice that it would still wait for the lock just as it does today; the only
difference is that regionserver resources (e.g., an RPC thread) aren't used.
After it acquires the lock, HTable would: (1) randomly generate a lockid, (2)
put an entry in a Map<lockid, zk node pathname>, (3) create a RowLock using
the lockid, and (4) return it from the method.
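
Just to make this concrete, here is a rough sketch of how it could look
inside HTable. Everything in it (the ZkRowLockSketch class name, the zk
handle, the lockIdToZNode map, the znode layout) is only an assumption for
illustration, and the waiting step is simplified to polling where the real
recipe would watch the predecessor node:

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.hbase.client.RowLock;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

/** Sketch only; in reality this would live inside HTable itself. */
public class ZkRowLockSketch {
  private final ZooKeeper zk;     // handle HTable would already own
  private final String table;     // table name
  private final Random rand = new Random();
  // lockid -> znode we created; unlockRow() needs it later
  final Map<Long, String> lockIdToZNode = new ConcurrentHashMap<Long, String>();

  public ZkRowLockSketch(ZooKeeper zk, String table) {
    this.zk = zk;
    this.table = table;
  }

  public RowLock lockRow(byte[] row) throws Exception {
    // assumes the parent znodes already exist (see [1] below for the layout)
    String lockDir = "/hbase/locks/" + table + "/" + Bytes.toString(row);
    // create an ephemeral sequential node, per the zk lock recipe
    String ourNode = zk.create(lockDir + "/lock-", new byte[0],
        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
    // wait until our node has the lowest sequence number
    while (true) {
      List<String> children = zk.getChildren(lockDir, false);
      Collections.sort(children);
      if (ourNode.endsWith(children.get(0))) break;  // we hold the lock
      Thread.sleep(10);  // real recipe: watch the next-lower node instead
    }
    // (1) random lockid, (2) remember its znode, (3) build a RowLock
    // with the existing RowLock(row, lockid) constructor and (4) return it
    long lockId = rand.nextLong();
    lockIdToZNode.put(lockId, ourNode);
    return new RowLock(row, lockId);
  }
}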

From this point on, any operation could be performed even without passing the
rowlock as a parameter: zookeeper plus the lock-recipe implementation in
HTable now ensure that no other client is performing any operation
concurrently on the given row. [2]
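
For example (purely illustrative, using the Put calls being shaped for the
0.20 client API; the table, family and qualifier names are made up), nothing
in between carries the RowLock:

HTable htable = new HTable("mytable");
byte[] row = Bytes.toBytes("row-0001");

RowLock lock = htable.lockRow(row);   // zk-arbitrated, as described above
try {
  Put put = new Put(row);             // note: no RowLock passed here
  put.add(Bytes.toBytes("colfam"), Bytes.toBytes("qual"),
      Bytes.toBytes("value"));
  htable.put(put);
} finally {
  htable.unlockRow(lock);
}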

Finally, *htable.unlockRow(lock)* must be invoked, which would make HTable
delete the matching zk node (remember the Map<lockid, zk node pathname>).
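
Continuing the sketch above (again, just an assumption of how it could look),
unlockRow() would then be:

// goes with the ZkRowLockSketch above: release by deleting our znode
public void unlockRow(RowLock lock) throws Exception {
  String znode = lockIdToZNode.remove(lock.getLockId());
  if (znode != null) {
    zk.delete(znode, -1);  // -1 = any version; being ephemeral, the node
                           // also disappears on its own if the client dies
  }
}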

One good thing about this approach is that we don't have to worry about lock
leases: if the client dies, zk will notice at some point in the future and
release the lock. And if the client forgets to unlock the row, its code is
wrong. (:

However, if we are to redesign the API, I would propose read and write locks.
HTable would then have two methods, HTable.lockRowForRead(row) and
HTable.lockRowForWrite(row) [3], and the lock recipe to implement would be
the Shared Locks recipe
(http://hadoop.apache.org/zookeeper/docs/current/recipes.html#Shared+Locks).
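
A sketch of how the two methods could sit on top of the same machinery as the
lockRow() sketch above (the method names are just the placeholders mentioned
here, see [3], and mayProceed() encodes the shared-locks rule: readers wait
only on earlier writers, writers wait on everything earlier):

// continuing the ZkRowLockSketch above
public RowLock lockRowForRead(byte[] row) throws Exception {
  return lockRow(row, "read-");
}

public RowLock lockRowForWrite(byte[] row) throws Exception {
  return lockRow(row, "write-");
}

private RowLock lockRow(byte[] row, String prefix) throws Exception {
  String lockDir = "/hbase/locks/" + table + "/" + Bytes.toString(row);
  String ourNode = zk.create(lockDir + "/" + prefix, new byte[0],
      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
  String ourName = ourNode.substring(ourNode.lastIndexOf('/') + 1);
  while (true) {
    List<String> children = zk.getChildren(lockDir, false);
    if (mayProceed(ourName, children, prefix.startsWith("read"))) break;
    Thread.sleep(10);  // again, the real recipe watches the blocking node
  }
  long lockId = rand.nextLong();
  lockIdToZNode.put(lockId, ourNode);
  return new RowLock(row, lockId);
}

// shared-locks rule: a reader waits only on earlier "write-" nodes,
// a writer waits on every earlier node
private boolean mayProceed(String ourName, List<String> children, boolean read) {
  int ourSeq = seqOf(ourName);
  for (String child : children) {
    if (seqOf(child) < ourSeq && (!read || child.startsWith("write-"))) {
      return false;
    }
  }
  return true;
}

private int seqOf(String name) {
  return Integer.parseInt(name.substring(name.lastIndexOf('-') + 1));
}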


[1] We may need to design carefully how the lock nodes are created, given zk
constraints on how many nodes it can manage under a single directory. Maybe
we should do something like:
/hbase/locks/table-name/hash-function(row)/row/{read-, write-}
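
For instance, a hypothetical helper for the layout above (the bucket count
and the choice of java.util.Arrays.hashCode as the hash function are
arbitrary here):

// bucket rows by a hash of the row key so that no single znode
// directory accumulates too many children
String lockDir(String tableName, byte[] row, int buckets) {
  int bucket = (Arrays.hashCode(row) & 0x7fffffff) % buckets;
  return "/hbase/locks/" + tableName + "/" + bucket + "/" + Bytes.toString(row);
}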

[2] I agree that we are not protecting ourselves from a malicious client
using HTable, who could simply "forget" to request the lock for the given
row and then mess everything up. But that is how it is everywhere, isn't it?

[3] Suggest better method names, please!


> St.Ack
>
>
> On Thu, May 14, 2009 at 9:44 AM, Guilherme Germoglio <germoglio@gmail.com>
> wrote:
>
> > This way, HTable could directly request for read or write row locks
> > (http://hadoop.apache.org/zookeeper/docs/current/recipes.html#Shared+Locks)
> > using zookeeper wrapper. The problem is that the client api would change a
> > little. Would these changes fit into the client api redesign for 0.20
> > (HBASE-1249)?
> >
> > On Thu, May 14, 2009 at 11:16 AM, stack <stack@duboce.net> wrote:
> >
> > > On Wed, May 13, 2009 at 11:00 PM, Joey Echeverria <joey42@gmail.com>
> > > wrote:
> > >
> > > > Wouldn't it be better to implement the row locks using zookeeper?
> > > >
> > >
> > > THBase was done before ZK was in the mix.  Now its here, we should look
> > > into
> > > using it.
> > >
> > > St.Ack
> > >
> >
> >
> >
> > --
> > Guilherme
> >
> > msn: guigermoglio@hotmail.com
> > homepage: http://germoglio.googlepages.com
> >
>



-- 
Guilherme

msn: guigermoglio@hotmail.com
homepage: http://germoglio.googlepages.com
