hadoop-common-user mailing list archives

From Amar Kamat <ama...@yahoo-inc.com>
Subject Re: MapReduce: Two Reduce Tasks
Date Wed, 16 Apr 2008 07:27:23 GMT
Chaman Singh Verma wrote:
> Hello,
>
> I think the question was slightly misinterpreted. What I meant by 3-4
> different tasks is that there are 3 different Reduce functionalities (each
> reduce functionality could be carried out by many task slaves, maybe 100).
> I want to reuse the output of Map for different types of operations. The
> examples I have come across contain one Map function and one Reduce
> function. I want one Map function and 3-4 Reduce functions that can all
> use the output of the Map function.
>
>   
Your reduce code will generate x key-value pairs, one for each of the x 
reduce functionalities. Encode each key with its functionality information 
before calling output.collect(). 
You need to write your own OutputFormat. This output format should 
extend MultipleOutputFormat (see 
org.apache.hadoop.mapred.lib.MultipleOutputFormat.java) and override 
generateFileNameForKeyValue(K key, V value, String name). Use the 
functionality information from the key to rename the output file, for 
example:
protected String generateFileNameForKeyValue(K key, V value, String name) {
  // decode will figure out the identifier for the functionality
  return name + "_" + decode(key);
}
Note that MultipleOutputFormat is available in Hadoop-0.17.
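One possible shape for the encode/decode helpers is sketched below in plain Java, without any Hadoop dependency. The class name FunctionalityKeys, the methods encode, decode and fileNameFor, and the ':' separator are all assumptions made for illustration; none of them is part of Hadoop. The reduce code would call encode() on each key before output.collect(), and the overridden generateFileNameForKeyValue() would call decode() on the key it receives.

```java
// Sketch only: every name here (FunctionalityKeys, encode, decode,
// fileNameFor, the ':' separator) is hypothetical and not a Hadoop API.
final class FunctionalityKeys {
    // Assumed separator; functionality identifiers must not contain it.
    private static final char SEPARATOR = ':';

    private FunctionalityKeys() {}

    /** Prefixes the original key with the functionality identifier. */
    static String encode(String functionality, String key) {
        if (functionality.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException(
                "functionality must not contain '" + SEPARATOR + "'");
        }
        return functionality + SEPARATOR + key;
    }

    /** Recovers the functionality identifier from an encoded key. */
    static String decode(String encodedKey) {
        int idx = encodedKey.indexOf(SEPARATOR);
        if (idx < 0) {
            throw new IllegalArgumentException(
                "key was not encoded: " + encodedKey);
        }
        return encodedKey.substring(0, idx);
    }

    /** Same naming rule as the generateFileNameForKeyValue body above. */
    static String fileNameFor(String encodedKey, String name) {
        return name + "_" + decode(encodedKey);
    }
}
```

With this scheme, all pairs emitted for one functionality end up in files that share the same suffix, so each functionality's output can be picked up separately afterwards. Note that the encoded key is what gets written unless the output format also strips the prefix.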
Amar
> Thanks,
>
> With Regards,
> Chaman Singh
>  
>
> Chaman Singh Verma wrote:
>   
>> Hello,
>>
>> I am developing some applications in which I can use the output of Map to
>> 3-4 different Reduce tasks ?
>> What is the best way to accomplish such task ? 
>>
>> Thanks.
>>
>> With regards,
>> csv
>>
>>     
>
>
> -----
> Chaman Singh Verma
> Poona, India
>   

