hbase-user mailing list archives

From "Jonathan Gray" <jl...@streamy.com>
Subject RE: HBase concurrent access and BatchUpdate obj
Date Wed, 06 Aug 2008 16:02:01 GMT
Jean-Adrien,

Interesting timing of your post.

I just filed HBASE-798 last night
(https://issues.apache.org/jira/browse/HBASE-798), Provide Client API to
explicitly lock and unlock rows.

This feature that I'm currently working on would expose the existing row
locking system used internally within HBase.  This would provide you with a
mechanism to: lock the row, get the current column value, increment and
write the new value, unlock the row.
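
To illustrate the lock, get, increment, write, unlock sequence, here is a minimal sketch that uses an in-memory map as a stand-in for a table. It does not use the HBase client API (the HBASE-798 API does not exist yet); the class RowLockSketch and its methods are hypothetical names chosen only for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory stand-in for a table with per-row locks.
// This is NOT the HBase API; it only mirrors the sequence described above.
public class RowLockSketch {
    private final Map<String, ReentrantLock> rowLocks = new ConcurrentHashMap<>();
    private final Map<String, Long> cells = new ConcurrentHashMap<>();

    // Lock the row, read the current value, write value + 1, unlock the row.
    public long increment(String row) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();
        try {
            long current = cells.getOrDefault(row, 0L);
            long next = current + 1;
            cells.put(row, next);
            return next;
        } finally {
            lock.unlock(); // always release, even if the update throws
        }
    }

    public long get(String row) {
        return cells.getOrDefault(row, 0L);
    }

    // Runs several concurrent writers against one row and returns the final value.
    public static long runDemo(int threads, int perThread) {
        RowLockSketch table = new RowLockSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    table.increment("row");
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return table.get("row");
    }

    public static void main(String[] args) {
        // 4 writers x 1000 increments each; no increment is lost under the lock.
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```

Because every writer holds the row's lock for the whole read-modify-write, no two writers can act on the same stale value.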

Currently, simple reads _do not_ take a row lock, so to protect this
particular table/family/column/whatever you would also need to route every
read through a new read method that obtains a lock on the row each time.

Another option for you (if you only have this one very particular use case
of incrementing a column) would be to write this "increment column value" as
a new client method internal to HBase that could just handle this lock,
read, write, unlock.  I have a sophisticated caching layer on top of HBase
including custom queries like joins/merges/nots/etc so we're heavy on
application level logic.  Exposing the locks to our code allows us to do all
this logic in our own code.
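
As a rough sketch of the "increment column value" idea, the whole read-modify-write can be pushed into a single call so the caller never handles the lock. The example below is an in-memory analogy only, not HBase code: AtomicIncrementSketch and incrementCell are hypothetical names, and the atomicity comes from ConcurrentHashMap.merge rather than from HBase row locks.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory analogy of a single-call increment; NOT the HBase API.
public class AtomicIncrementSketch {
    private final ConcurrentHashMap<String, Long> cells = new ConcurrentHashMap<>();

    // Adds 'amount' to the cell and returns the new value.
    // ConcurrentHashMap.merge performs the read and the write atomically per key.
    public long incrementCell(String row, String column, long amount) {
        return cells.merge(row + "/" + column, amount, Long::sum);
    }

    public static void main(String[] args) {
        AtomicIncrementSketch table = new AtomicIncrementSketch();
        table.incrementCell("row", "column_family:column", 1L);
        long value = table.incrementCell("row", "column_family:column", 1L);
        System.out.println(value); // prints 2
    }
}
```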

And a final option would be using an external locking system, as you
mentioned, but for something this simple I don't think that's necessary.

Jon

-----Original Message-----
From: Jean-Adrien [mailto:adv1@jeanjean.ch] 
Sent: Wednesday, August 06, 2008 3:54 AM
To: hbase-user@hadoop.apache.org
Subject: HBase concurrent access and BatchUpdate obj


Hello,

I just read the message about concurrent access in this mailing list
http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200804.mbox/%3c84E2AE771361E9419DD0EFBD31F09C4D4F5A4E89B9@EXVMBX015-1.exch015.msoutlookonline.net%3e

and I just want to verify this:

If I use the upcoming version of HBase, i.e. 0.2.0, which introduces
BatchUpdate objects, all my updates are made atomically. OK. But there is no
lock mechanism to prevent a write-after-write fault if, for example, I want
code like this (I haven't found the 0.2.0 API online, so it's just
pseudocode):

// load a cell from hbase
Cell c = table.get("row", "column_family:column");
// make some computation with the cell content
int oldValue = Integer.parseInt(c.getContent());
int newValue = oldValue + 1;
// update the cell
BatchUpdate bu = new BatchUpdate(c);
c.setContent(String.valueOf(newValue));
bu.commit();

That is, take the current value of a cell, compute a new value that depends
on the current content, and save it.

I guess there is NO mechanism preventing another client that runs the same
code at the same time from reading the same cell value as me and
incrementing it. That means that instead of the cell value sequence
1 -> 2 -> 3 -> 4, it takes the values 1 -> 2 -> 2 -> 3.
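
The interleaving described above can be replayed deterministically. This is a minimal sketch with a plain in-memory map standing in for the table (not HBase code); LostUpdateSketch and interleavedIncrements are hypothetical names used only here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical replay of two clients whose read-increment-write steps interleave.
public class LostUpdateSketch {
    static int interleavedIncrements(int start) {
        Map<String, Integer> table = new HashMap<>();
        table.put("row", start);

        // Both clients read before either one writes.
        int readByClientA = table.get("row");
        int readByClientB = table.get("row");

        // Each client writes "its" value + 1; the second write repeats the first.
        table.put("row", readByClientA + 1);
        table.put("row", readByClientB + 1);

        return table.get("row");
    }

    public static void main(String[] args) {
        // Two increments starting from 1 should give 3, but one update is lost.
        System.out.println(interleavedIncrements(1)); // prints 2
    }
}
```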

Am I right ? 

Should a lock mechanism be provided to the user through another system such
as ZooKeeper, or is it a goal of HBase to provide such a locking system?

Thanks a lot.
Have a nice day

-- 
View this message in context:
http://www.nabble.com/HBase-concurrent-access-and-BatchUpdate-obj-tp18848763p18848763.html
Sent from the HBase User mailing list archive at Nabble.com.

