hbase-user mailing list archives

From Ben Kim <benkimkim...@gmail.com>
Subject HBase mapreduce sink - using a custom TableReducer to pass in Puts
Date Tue, 15 May 2012 01:50:04 GMT

I'm writing a MapReduce job that reads a SequenceFile and writes the data to HBase.
Normally, as the HBase tutorial suggests, you would create a Put in the
TableMapper and pass it to IdentityTableReducer. That does in fact work for me.

But now I'm trying to keep the computation in the mapper and let the reducer
take care of writing to HBase.
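For reference, the tutorial-style pattern that works for me looks roughly like this (a sketch, not my exact code; the mapper class, column family, qualifier, and table name below are placeholders):

```java
// Sketch of the tutorial approach: the mapper builds the Put itself and
// IdentityTableReducer just forwards it to TableOutputFormat.
// Class, family, qualifier, and table names are placeholders.
public class MyPutMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        byte[] row = Bytes.toBytes(value.toString());
        Put put = new Put(row);
        // one cell per input record; family "cf", qualifier "q" are made up
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), row);
        context.write(new ImmutableBytesWritable(row), put);
    }
}

// Job wiring for this pattern (IdentityTableReducer passes Puts through):
// TableMapReduceUtil.initTableReducerJob("my_table",
//         IdentityTableReducer.class, job);
```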

Following is my TableReducer:

public class MyTableReducer extends TableReducer<ImmutableBytesWritable,
        KeyValue, ImmutableBytesWritable> {
    @Override
    public void reduce(ImmutableBytesWritable key, Iterable<KeyValue> values,
            Context context) throws IOException, InterruptedException {
        Put put = new Put(key.get());
        for (KeyValue kv : values) {
            put.add(kv);          // accumulate every cell for this row key
        }
        context.write(key, put);  // emit one Put per row
    }
}
For testing purposes, I'm writing 10 rows with 10 cells each.
I add multiple cells to each Put (put.add(kv)), but this reducer only
writes the one last cell passed by the mapper!

Following is the setup of the job:

        Job itemTableJob = prepareJob(
                inputPath, outputPath, SequenceFileInputFormat.class,
                MyMapper.class, ImmutableBytesWritable.class, KeyValue.class,
                MyTableReducer.class, ImmutableBytesWritable.class,
                Writable.class, TableOutputFormat.class);

        TableMapReduceUtil.initTableReducerJob("rs_system", null, itemTableJob);
Am I missing something?


Benjamin Kim
Tel : +82 2.6400.3654 | Mo : +82 10.5357.0521
benkimkimben at gmail
