hbase-user mailing list archives

From Pavel <pavli...@gmail.com>
Subject help with reduce phase understanding
Date Wed, 30 Jul 2008 14:09:59 GMT
Hi,

I don't fully understand the MapReduce approach and would like to ask some
questions, mainly about the reduce phase. Below is a reduce job that counts
the values for a given row key and inserts the resulting count into another
table under the same row key.

What I cannot figure out is how this code would behave if several reducers
are running. Is it possible that they will process values for the same row
key and, as a consequence, write stale data into the table? Say reducerA has
counted a total of 5 messages and reducerB a total of 3: would that end up
as 8 in the resulting table?

Thank you.
Pavel

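import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapred.TableReduce;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class MessagesTableReduce extends TableReduce<Text, LongWritable> {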

    public void reduce(Text key, Iterator<LongWritable> values,
            OutputCollector<Text, MapWritable> output, Reporter reporter)
            throws IOException {

        System.out.println("REDUCE: processing messages for author: "
                + key.toString());

        int total = 0;
        while (values.hasNext()) {
            values.next();
            total++;
        }

        MapWritable map = new MapWritable();
        map.put(new Text("messages:sent"),
                new ImmutableBytesWritable(String.valueOf(total).getBytes()));
        output.collect(key, map);
    }
}
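
As a rough illustration of the several-reducers scenario, here is a minimal
standalone sketch (no Hadoop or HBase dependencies) of how keys are routed to
reducers, assuming the job uses Hadoop's default HashPartitioner. The class
and method names (PartitionSketch, partitionFor, shuffleAndCount) and the
sample author names are hypothetical and only for illustration. It uses
String.hashCode() where a real job would use Text.hashCode(), so the actual
reducer indices would differ. Under that assumption, every value for a given
row key lands on exactly one reducer, so the 5 + 3 split described above
would not occur: a single reduce call would see all 8 values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of the shuffle step; not part of Hadoop or HBase.
class PartitionSketch {

    // Same formula as Hadoop's default HashPartitioner, applied here to a
    // String key instead of a Text key (an assumption made for this sketch).
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Routes each map-output key to one reducer and counts occurrences per
    // key, mimicking what the reduce method above does with its iterator.
    static List<Map<String, Integer>> shuffleAndCount(List<String> mapOutputKeys,
                                                      int numReducers) {
        List<Map<String, Integer>> reducers = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            reducers.add(new TreeMap<>());
        }
        for (String key : mapOutputKeys) {
            reducers.get(partitionFor(key, numReducers)).merge(key, 1, Integer::sum);
        }
        return reducers;
    }

    public static void main(String[] args) {
        // One entry per message, keyed by author (the row key).
        List<String> keys = Arrays.asList(
                "alice", "bob", "alice", "carol", "alice", "bob");
        List<Map<String, Integer>> reducers = shuffleAndCount(keys, 2);
        for (int i = 0; i < reducers.size(); i++) {
            System.out.println("reducer " + i + " -> " + reducers.get(i));
        }
    }
}
```

Each author shows up in the output of only one reducer, with the full count
for that author. A job configured with a custom partitioner would need its
own check.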
