hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: Reduce Output
Date Mon, 14 Apr 2008 16:46:41 GMT

Write an additional map-reduce step to join the data items together by
treating different input files differently.
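
As a rough illustration of the join idea, here is a plain-Java sketch that
simulates the map, shuffle and reduce phases in memory rather than using the
Hadoop classes. Everything in it is an assumption made for the example: the
class and method names are hypothetical, table lines are taken to be
"code<TAB>description", and data lines are taken to be "key<TAB>code code ..."
with the codes already separated by spaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory simulation of a reduce-side join; not Hadoop API code.
class JoinSketch {
    // Map phase: key every record by code and tag it with the input it came from.
    static void map(String source, String line, Map<String, List<String>> shuffle) {
        String[] parts = line.split("\t", 2);
        if (parts.length < 2) return;
        if (source.equals("table")) {
            // Table line (assumed format): "code<TAB>description".
            emit(shuffle, parts[0].trim(), "D\t" + parts[1].trim());
        } else {
            // Data line (assumed format): "key<TAB>code code ...".
            // The position is kept so the original order can be restored later.
            String[] codes = parts[1].trim().split("\\s+");
            for (int i = 0; i < codes.length; i++) {
                emit(shuffle, codes[i], "R\t" + parts[0] + "\t" + i);
            }
        }
    }

    // Stand-in for the shuffle: collect all values that share a key.
    static void emit(Map<String, List<String>> shuffle, String key, String value) {
        shuffle.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Reduce phase: pair the description of one code with every data record using it.
    static List<String> reduce(String code, List<String> values) {
        String description = code; // fall back to the raw code if the table has no entry
        for (String v : values) {
            if (v.startsWith("D\t")) description = v.substring(2);
        }
        List<String> out = new ArrayList<>();
        for (String v : values) {
            if (v.startsWith("R\t")) out.add(v.substring(2) + "\t" + description);
        }
        return out;
    }

    // Runs both inputs through map, then reduces each group of values.
    static List<String> run(List<String> tableLines, List<String> dataLines) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (String line : tableLines) map("table", line, shuffle);
        for (String line : dataLines) map("data", line, shuffle);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("206\tTextDesp206", "475\tTextDesp475");
        List<String> data = Arrays.asList("1.0.2.92\t206 475");
        for (String line : run(table, data)) System.out.println(line);
    }
}
```

Each output line is "key<TAB>position<TAB>description". Regrouping those lines
by the original key, in position order, would still be needed to build the
"TextDesp206 -> TextDesp475" form and is left out of the sketch.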

OR

Write an additional map-reduce step that reads in your string values in the
map configuration method and keeps them in memory for looking up as you pass
over the output of your previous reduce step.  You won't need a reducer for
this approach, but your conversion table will have to fit into memory.
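
The in-memory lookup could look roughly like the following plain-Java sketch.
It is not written against the Hadoop classes: LookupMapper, configure and map
are hypothetical stand-ins for the mapper, its configuration method and its map
call. It also assumes that table lines are "code<TAB>description" and that each
line of the previous reduce output is "key<TAB>code code ..." with the codes
separated by spaces.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for a map-only job; not Hadoop API code.
class LookupMapper {
    // Conversion table, held in memory for the lifetime of the task.
    private final Map<String, String> descriptions = new HashMap<>();

    // Plays the role of the map configuration method: load the table once.
    // Assumed table line format: "code<TAB>description".
    void configure(List<String> tableLines) {
        for (String line : tableLines) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) {
                descriptions.put(parts[0].trim(), parts[1].trim());
            }
        }
    }

    // Plays the role of map(): translate one line of the previous reduce output.
    // Assumed line format: "key<TAB>code code ...".
    String map(String line) {
        String[] parts = line.split("\t", 2);
        if (parts.length < 2) return line; // no value column, pass the line through
        List<String> names = new ArrayList<>();
        for (String code : parts[1].trim().split("\\s+")) {
            // Codes missing from the table are passed through unchanged.
            names.add(descriptions.getOrDefault(code, code));
        }
        return parts[0] + "\t" + String.join(" -> ", names);
    }

    public static void main(String[] args) {
        LookupMapper mapper = new LookupMapper();
        mapper.configure(Arrays.asList("206\tTextDesp206", "475\tTextDesp475"));
        System.out.println(mapper.map("1.0.2.92\t206 475"));
    }
}
```

Because every output line depends on only one input line plus the table, no
reducer is involved; the cost is that the whole table sits in each task's
memory.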

OR

Write a sequential script to read your string values and iterate over the
reduce output using conventional methods.  This works very well if you can
process your data in less time than Hadoop takes to start your job.

On 4/14/08 9:42 AM, "Natarajan, Senthil" <senthil@pitt.edu> wrote:

> Hi,
> 
> My reduce output looks like this:
> 
> 1.0.2.92    206475
> 1.0.2.9     316475847
> 1.0.3.93    3846495
> 1.0.4.93    316975
> 
> But I want to display it like this:
> 
> 1.0.2.92    206 475
> 1.0.2.9     316 475 847
> 1.0.3.93    384 6495
> 1.0.4.93    316 975
> 
> Each value has a description associated with it, something like this:
> 
> 206    ->    TextDesp206
> 475    ->    TextDesp475
> 316    ->    TextDesp316
> 847    ->    TextDesp847
> 
> So eventually I would like my output to look like this:
> 
> 1.0.2.92    TextDesp206 -> TextDesp475
> 1.0.2.9     TextDesp316 -> TextDesp475 -> TextDesp847
> 
> How can I do this? I tried different ways, but no luck.
> 
> public static class Reduce extends MapReduceBase
>     implements Reducer<Text, IntWritable, Text, IntWritable> {
> 
>   public void reduce(Text key, Iterator<IntWritable> values,
>       OutputCollector<Text, IntWritable> output, Reporter reporter)
>       throws IOException {
>     Text word = new Text();
>     String sum = "";
>     while (values.hasNext()) {
>       sum += values.next().get() + " ";
>     }
>     //output.collect(key, new IntWritable(Integer.parseInt(sum)));
>     word.set(sum);
>     output.collect(word, new IntWritable(Integer.parseInt(key.toString())));
>   }
> }
> 
> Is there any way to use Reducer and OutputCollector, or any other classes, to
> produce output like this?
> 
> 1.0.2.92    TextDesp206 -> TextDesp475
> 1.0.2.9     TextDesp316 -> TextDesp475 -> TextDesp847
> 
> Thanks,
> Senthil

