hadoop-common-user mailing list archives

From Bryan Duxbury <br...@rapleaf.com>
Subject Re: ID Service with HBase?
Date Wed, 16 Apr 2008 21:27:32 GMT
HBASE-493 was created, and seems similar. It's a
write-if-not-modified-since.
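
To illustrate why a conditional write would help with ID allocation, here is a minimal sketch against an in-memory map. It is not the HBase API and not what HBASE-493 specifies; the class and method names (ConditionalWriteSketch, reserveBlock) are made up for illustration. The write goes through only if the cell still holds the value that was read, otherwise the caller re-reads and retries:

```java
import java.util.concurrent.ConcurrentHashMap;

class ConditionalWriteSketch {
    // Stand-in for the ID table: namespace -> next free ID. Assumption: in-memory only.
    private final ConcurrentHashMap<String, Long> cells = new ConcurrentHashMap<>();

    /** Reserves the ID range [start, start + size) and returns start. */
    long reserveBlock(String namespace, long size) {
        while (true) {
            Long current = cells.get(namespace);
            long start = (current == null) ? 0L : current;
            Long next = start + size;
            // Conditional write: succeeds only if the cell is unchanged since the read.
            boolean written = (current == null)
                    ? cells.putIfAbsent(namespace, next) == null
                    : cells.replace(namespace, current, next);
            if (written) {
                return start;
            }
            // Another writer updated the cell first; re-read and try again.
        }
    }
}
```

Two callers that read the same value can no longer both succeed: one write fails and that caller picks up the next free range on its retry.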

I would guess that you probably don't want to use HBase to maintain a
distributed auto-increment. You need to think of some other approach
that produces unique ids across concurrent access, like a hash or GUID
or something like that.
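
A minimal sketch of the hash/GUID idea, using only java.util.UUID from the JDK. The class and method names (UniqueIds, randomId, hashId) are hypothetical, and whether random or content-derived ids fit the cross-reference use case is an assumption:

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class UniqueIds {
    /** Random (version 4) UUID; needs no shared state between mappers. */
    static String randomId() {
        return UUID.randomUUID().toString();
    }

    /** Name-based (version 3, MD5) UUID: the same content always yields the same id. */
    static String hashId(String content) {
        return UUID.nameUUIDFromBytes(content.getBytes(StandardCharsets.UTF_8)).toString();
    }
}
```

With randomId() each mapper generates ids independently; with hashId() two mappers that see the same record derive the same id without talking to each other, which may be convenient for cross-references.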

-Bryan

On Apr 16, 2008, at 2:18 PM, Jim Kellerman wrote:

> Row locks do not apply to reads, only updates. They prevent two  
> applications from updating the same row simultaneously. There is no  
> other locking mechanism in HBase. (It follows Bigtable in this  
> regard. See http://labs.google.com/papers/bigtable.html )
>
> There has been some discussion about adding a conditional write
> (i.e. only completes successfully if the current value of the cell
> being updated has value x), but no one has thought it important
> enough to enter an enhancement request on the HBase Jira:
> https://issues.apache.org/jira/browse/HBASE
>
> By the way, you will get a more timely response to HBase questions
> if you address them to the hbase mailing list:
> hbase-user@hadoop.apache.org
>
> ---
> Jim Kellerman, Senior Engineer; Powerset
>
>
>> -----Original Message-----
>> From: Thomas Thevis [mailto:Thomas.Thevis@semgine.com]
>> Sent: Wednesday, April 16, 2008 4:22 AM
>> To: core-user@hadoop.apache.org
>> Subject: ID Service with HBase?
>>
>> Hello list readers,
>>
>> I'd like to perform mass data operations resulting in several
>> output files with cross-references between lines in different
>> files. For this purpose, I want to use a kind of ID service
>> and I wonder whether I could use HBase for this task.
>> However, until now I was not able to use the HBase locking
>> mechanism in a way that newly created IDs are unique.
>>
>> The setup:
>> - each Mapper has its own instance of an IDService implementation
>> - each IDService instance has its own reference to the ID
>> table in the HBase
>>
>> The code snippet which is used to return and update IDs:
>> [code]
>> final String columnName = this.config.get(ID_COLUMN_ID);
>> final Text column = new Text(columnName);
>> final String tableName = this.config.get(ID_SERVICE_TABLE_ID);
>> final HTable table = new HTable(this.config, new Text(tableName));
>> final Text rowName = new Text(namespace);
>> final long startValue;
>>
>> final long lockid = table.startUpdate(rowName);
>> final byte[] bytes = table.get(rowName, column);
>> if (bytes == null) {
>>      startValue = 0;
>> } else {
>>      final ByteArrayInputStream byteArrayInputStream
>>          = new ByteArrayInputStream(bytes);
>>      final LongWritable longWritable = new LongWritable();
>>      longWritable.readFields(new DataInputStream(byteArrayInputStream));
>>      startValue = longWritable.get();
>> }
>> final long stopValue = startValue + size;
>> table.put(lockid, column, new LongWritable(stopValue));
>> table.commit(lockid);
>> [/code]
>>
>> As stated above, the resulting IDs are not unique: about a
>> quarter of all created IDs appear several times.
>> Now my question: Do I use the locking mechanism the wrong way
>> or is my approach to use HBase locking and synchronizing for
>> this task completely wrong?
>>
>> Thanks,
>>
>> Thomas

