hbase-user mailing list archives

From Guilherme Germoglio <germog...@gmail.com>
Subject Re: Problems when executing many (?) HTable.lockRow()
Date Mon, 18 May 2009 14:23:57 GMT
Hello,

What is the current state of zk in hbase, and what will it be in version
0.20? I mean: can we, or will we be able to, run hbase without zk?

I ask because I think that making row locks perform well is essential for
users, and this approach might also make the code simpler. However, I don't
know what the drawbacks of creating such a dependency on zk would be.

thanks,

On Fri, May 15, 2009 at 2:40 AM, Nitay <nitayj@gmail.com> wrote:

> I like this a lot, Guilherme. Perhaps we should open a JIRA for them so we
> can track these great ideas.
>
> On Thu, May 14, 2009 at 7:05 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>
> > Given its non-core nature, I think the API should potentially facilitate
> > this, but the code should live in contrib.
> >
> > On May 14, 2009 5:32 PM, "Guilherme Germoglio" <germoglio@gmail.com>
> > wrote:
> >
> > On Thu, May 14, 2009 at 3:40 PM, stack <stack@duboce.net> wrote: > No
> > consideration has been made f...
> > I think so.
> >
> > If nothing is to be changed on RowLock class, we could use the following
> > approach:
> >
> > Considering the line as is today:
> >
> > *RowLock lock = htable.lockRow(row);*
> >
> > *htable.lockRow(row)*, instead of contacting the regionserver and
> > requesting a lock for the given row, would contact zk for a lock, just as
> > in the lock recipe
> > <http://hadoop.apache.org/zookeeper/docs/current/recipes.html#sc_recipes_Locks>[1].
> > Notice that it would wait for the lock just as it does today; the
> > only difference is that regionserver resources (e.g., an RPC thread) aren't
> > used. After it receives the lock, htable would: (1) randomly generate a
> > lockid, (2) put an entry in a Map<lockid, zk node pathname>, (3) create a
> > RowLock using the lockid and (4) return it from the method.
> >
> > From this point, any operation could be performed even without passing
> > rowlock as a parameter: zookeeper + the implementation of the lock recipe
> > in htable are now ensuring that no other client would be performing any
> > operation concurrently on the given row. [2]
> >
> > Finally, *htable.unlockRow(lock)* must be invoked, which would make
> > HTable delete the matching zk node (remember the Map<lockid, zk node
> > pathname>).
> >
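
A minimal sketch of the client-side bookkeeping described above (steps 1 to 4
plus the unlock), assuming a hypothetical LockService interface that stands in
for the zk lock recipe. LockService, ClientRowLocks and their methods are
illustrative names, not HBase or ZooKeeper APIs, and the sketch returns the
bare lockid where a real implementation would wrap it in a RowLock:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical stand-in for the zk lock recipe; not a real HBase/ZooKeeper API. */
interface LockService {
    /** Blocks until the lock for the row is granted; returns the zk node pathname. */
    String acquire(String rowPath);

    /** Deletes the given lock node, which releases the lock. */
    void release(String nodePath);
}

/** Client-side Map<lockid, zk node pathname> bookkeeping (assumed design). */
final class ClientRowLocks {
    private final LockService locks;
    private final Map<Long, String> nodeByLockId = new ConcurrentHashMap<>();

    ClientRowLocks(LockService locks) {
        this.locks = locks;
    }

    long lockRow(String rowPath) {
        // Waits on the lock service, so no regionserver RPC thread is held.
        String nodePath = locks.acquire(rowPath);
        long lockId;
        do {
            lockId = ThreadLocalRandom.current().nextLong();          // (1) random lockid
        } while (nodeByLockId.putIfAbsent(lockId, nodePath) != null); // (2) remember the node
        return lockId; // (3) + (4): the caller would build a RowLock from this id
    }

    void unlockRow(long lockId) {
        String nodePath = nodeByLockId.remove(lockId);
        if (nodePath == null) {
            throw new IllegalArgumentException("unknown lock id: " + lockId);
        }
        locks.release(nodePath); // deleting the node hands the lock to the next waiter
    }
}
```

The retry loop around putIfAbsent only guards against the unlikely case of two
live locks drawing the same random id.
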
> > One good thing about this approach is that we don't have to worry about
> > lock leases: if the client dies, zk will notice at some point in the
> > future and release the lock. And if the client forgets to unlock the row,
> > its code is wrong. (:
> >
> > However, if we are to redesign the API, I would propose write and read
> > locks. Then htable would have two methods: HTable.lockRowForRead(row) and
> > HTable.lockRowForWrite(row) [3], and the lock recipe to be implemented
> > would be the Shared Locks recipe
> > <http://hadoop.apache.org/zookeeper/docs/current/recipes.html#Shared+Locks>.
> >
> >
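
To make the read/write proposal concrete, here is a rough sketch of the
decision at the heart of the Shared Locks recipe: given the sequential
children of a row's lock directory, does my node hold the lock yet? It assumes
the children are named read-<seq> and write-<seq>, as in the recipe;
SharedLockCheck and its methods are made-up names, and all zk calls (creating
the node, listing children, setting watches) are left out:

```java
import java.util.List;

/** Pure decision logic only; the caller is assumed to fetch the children from zk. */
final class SharedLockCheck {

    /** Parses the sequence suffix of a sequential node, e.g. "read-0000000007" -> 7. */
    static long sequenceOf(String child) {
        return Long.parseLong(child.substring(child.lastIndexOf('-') + 1));
    }

    /**
     * A read node holds the lock if no write node has a lower sequence number.
     * A write node holds the lock only if no node at all has a lower sequence number.
     */
    static boolean holdsLock(String mine, List<String> children) {
        boolean writer = mine.startsWith("write-");
        long mySeq = sequenceOf(mine);
        for (String child : children) {
            if (sequenceOf(child) >= mySeq) {
                continue; // only earlier requests can block us
            }
            if (writer || child.startsWith("write-")) {
                return false;
            }
        }
        return true;
    }
}
```

A client whose node does not hold the lock yet would watch the blocking node
and check again when it goes away, as the recipe describes.
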
> > [1] We may need to design carefully how the locknode is created, given
> > zk constraints on how many nodes it can manage in a single directory.
> > Maybe we should do something like:
> > /hbase/locks/table-name/hash-function(row)/row/{read-, write-}
> >
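
One possible reading of the layout in [1], as a sketch only. The bucket count
(256), the use of Arrays.hashCode as the hash function and the hex encoding of
the row key are all assumptions made for illustration; LockPaths is a made-up
helper, not part of HBase:

```java
import java.util.Arrays;

/** Builds /hbase/locks/table-name/hash-function(row)/row/{read-, write-} paths. */
final class LockPaths {
    /** Assumed number of hash buckets per table; bounds the children of one directory. */
    static final int BUCKETS = 256;

    /** Directory that would hold the lock nodes for a single row. */
    static String lockDir(String table, byte[] row) {
        int bucket = Math.floorMod(Arrays.hashCode(row), BUCKETS);
        StringBuilder hex = new StringBuilder(row.length * 2);
        for (byte b : row) {
            // Hex-encode so arbitrary row bytes cannot produce an invalid path.
            hex.append(String.format("%02x", b & 0xff));
        }
        return "/hbase/locks/" + table + "/" + bucket + "/" + hex;
    }

    /** Prefix to pass when creating a sequential read-lock node. */
    static String readPrefix(String table, byte[] row) {
        return lockDir(table, row) + "/read-";
    }

    /** Prefix to pass when creating a sequential write-lock node. */
    static String writePrefix(String table, byte[] row) {
        return lockDir(table, row) + "/write-";
    }
}
```

With this layout each table directory has at most BUCKETS children, and the
rows of a table are spread across those buckets instead of sitting in a single
directory.
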
> > [2] I agree that we are not protecting ourselves from a malicious client
> > using HTable, who could simply "forget" to request the lock for the given
> > row and then mess everything up. But this is how it is everywhere, isn't it?
> >
> > [3] Suggest better method names, please!
> >
> > > St.Ack
> > >
> > > On Thu, May 14, 2009 at 9:44 AM, Guilherme Germoglio <germoglio@gmail.com> wrote:...
> > --
> >
> > Guilherme
> >
> > msn: guigermoglio@hotmail.com
> > homepage: http://germoglio.googlepages.com
> >
>



-- 
Guilherme

msn: guigermoglio@hotmail.com
homepage: http://germoglio.googlepages.com
