hadoop-common-user mailing list archives

From Thomas Thevis <Thomas.The...@semgine.com>
Subject ID Service with HBase?
Date Wed, 16 Apr 2008 11:22:29 GMT
Hello list readers,

I'd like to perform mass data operations resulting in several output 
files with cross-references between lines in different files. For this 
purpose, I want to use a kind of ID service and I wonder whether I could 
use HBase for this task.
However, so far I have not been able to use the HBase locking mechanism 
in such a way that newly created IDs are unique.

The setup:
- each Mapper has its own instance of an IDService implementation
- each IDService instance has its own reference to the ID table in HBase

The code snippet which is used to return and update IDs:
final String columnName = this.config.get(ID_COLUMN_ID);
final Text column = new Text(columnName);
final String tableName = this.config.get(ID_SERVICE_TABLE_ID);
final HTable table = new HTable(this.config, new Text(tableName));
final Text rowName = new Text(namespace);
final long startValue;

// Start an update on the namespace row, then read the current counter.
final long lockid = table.startUpdate(rowName);
final byte[] bytes = table.get(rowName, column);
if (bytes == null) {
     // No counter stored yet for this namespace.
     startValue = 0;
} else {
     // Deserialize the stored LongWritable.
     final ByteArrayInputStream byteArrayInputStream
         = new ByteArrayInputStream(bytes);
     final LongWritable longWritable = new LongWritable();
     longWritable.readFields(new DataInputStream(byteArrayInputStream));
     startValue = longWritable.get();
}
// Reserve the block [startValue, stopValue) by storing the new upper bound.
final long stopValue = startValue + size;
table.put(lockid, column, new LongWritable(stopValue));
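
To make the expected behavior concrete: each call should reserve a block of IDs that no other caller can receive. The following is only a minimal sketch of that contract, assuming an in-memory AtomicLong in place of the HBase table. The class IdBlockSketch and its methods reserveBlock and distinctIds are hypothetical names for illustration and are not part of HBase.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** In-memory stand-in for the ID table; hypothetical, not an HBase API. */
public class IdBlockSketch {

    /** Plays the role of the counter cell for one namespace. */
    private final AtomicLong counter = new AtomicLong(0);

    /**
     * Reserves the block [start, start + size) and returns start.
     * getAndAdd performs the read and the write as one atomic step,
     * so two callers can never obtain overlapping blocks.
     */
    public long reserveBlock(long size) {
        return counter.getAndAdd(size);
    }

    /** Runs several workers against one counter and counts the distinct IDs handed out. */
    public static int distinctIds(int workers, int blocksPerWorker, int size) {
        IdBlockSketch service = new IdBlockSketch();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                for (int b = 0; b < blocksPerWorker; b++) {
                    long start = service.reserveBlock(size);
                    for (long id = start; id < start + size; id++) {
                        seen.add(id);
                    }
                }
            });
            threads[w].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int expected = 4 * 100 * 10;
        int distinct = distinctIds(4, 100, 10);
        System.out.println(distinct == expected ? "all unique" : "duplicates");
    }
}
```

If the read and the write were two separate steps with no exclusion between them, two workers could read the same start value and hand out the same block, which would match the duplicates described below.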

As stated above, the resulting IDs are not unique: about a quarter of all 
created IDs appear several times.
Now my question: Do I use the locking mechanism the wrong way or is my 
approach to use HBase locking and synchronizing for this task completely 


