hadoop-mapreduce-user mailing list archives

From Francis.Hu <francis...@reachjunction.com>
Subject Re: Is there any way to set Reducer to output to multi-places?
Date Mon, 02 Sep 2013 09:50:04 GMT
Thanks, Binglin

 

I found the class below that can do it :).

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html

From: Binglin Chang [mailto:decstery@gmail.com] 
Sent: Monday, September 02, 2013 17:37
To: user@hadoop.apache.org
Subject: Re: Is there any way to set Reducer to output to multi-places?

 

MultipleOutputFormat allows you to write multiple files in one reducer, but it can't write output
to HDFS and a database concurrently. It is, however, a good example of how you can write a customized
OutputFormat to achieve this.
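
As a rough illustration of the "customized OutputFormat" idea, the core of it is a writer that forwards each record to two destinations. The sketch below is plain Java with no Hadoop dependency: KeyValueSink, TeeSink and ListSink are hypothetical names made up for this example, standing in for the record writer a custom OutputFormat would return. It is only a sketch of the fan-out pattern, not Hadoop's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Hadoop record writer; NOT a Hadoop type.
interface KeyValueSink<K, V> {
    void write(K key, V value);
    void close();
}

// Forwards every record to two underlying sinks (e.g. a file and a database).
final class TeeSink<K, V> implements KeyValueSink<K, V> {
    private final KeyValueSink<K, V> first;
    private final KeyValueSink<K, V> second;

    TeeSink(KeyValueSink<K, V> first, KeyValueSink<K, V> second) {
        this.first = first;
        this.second = second;
    }

    @Override
    public void write(K key, V value) {
        first.write(key, value);
        second.write(key, value);
    }

    @Override
    public void close() {
        // Close the second sink even if closing the first one throws.
        try {
            first.close();
        } finally {
            second.close();
        }
    }
}

// In-memory sink used only to demonstrate the fan-out.
final class ListSink<K, V> implements KeyValueSink<K, V> {
    final List<String> lines = new ArrayList<>();
    boolean closed = false;

    @Override
    public void write(K key, V value) {
        lines.add(key + "\t" + value);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

In a real job, one of the two sinks would wrap the file writer and the other the database writer; the reducer itself would still see a single output.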

Please note that, for fault tolerance, a reducer task may run multiple times, which may generate
redundant data. Hadoop handles this for files using FileOutputCommitter, but you need to handle the
database case yourself (e.g. insert a record only if it doesn't already exist).
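
To make the "insert only if the record doesn't exist" point concrete, here is a minimal plain-Java sketch. IdempotentTable and ReducerRetryDemo are hypothetical names; the in-memory map is just a stand-in for a database table with a unique key, so this shows the idea rather than real database code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a database table with a unique key.
final class IdempotentTable {
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    // Inserts the record only if the key is absent; returns true if a row was added.
    boolean insertIfAbsent(String key, String value) {
        return rows.putIfAbsent(key, value) == null;
    }

    int size() {
        return rows.size();
    }
}

final class ReducerRetryDemo {
    // Simulates one reducer attempt writing its records; returns the number of new rows.
    static int runAttempt(IdempotentTable table, String[][] records) {
        int inserted = 0;
        for (String[] record : records) {
            if (table.insertIfAbsent(record[0], record[1])) {
                inserted++;
            }
        }
        return inserted;
    }
}
```

If the same attempt is replayed against the same table, the second run adds nothing, so a re-executed reducer does not leave duplicate rows behind.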

 

On Mon, Sep 2, 2013 at 5:11 PM, Rahul Bhattacharjee <rahul.rec.dgp@gmail.com> wrote:

This might help

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputFormat.html

Thanks,
Rahul

 

On Mon, Sep 2, 2013 at 2:38 PM, Francis.Hu <francis.hu@reachjunction.com> wrote:

Hi all,

 

Is there any way to set a Reducer to output to multiple places? For example, a reducer's result
could be output to HDFS and a database concurrently.

 

Thanks,

Francis.Hu

 

 

