hadoop-mapreduce-user mailing list archives

From Bejoy Ks <bejoy.had...@gmail.com>
Subject Re: how to overwrite output in HDFS?
Date Tue, 03 Apr 2012 11:36:25 GMT
Hi Xin
     A simple approach is to add a check to your Driver class: before
submitting the job, test whether the output directory already exists in HDFS
and, if it does, delete it.
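
A minimal sketch of such a check, assuming the standard Hadoop FileSystem API
is on the classpath. The class name OutputCleaner and the method name
deleteIfExists are hypothetical, chosen only for illustration:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper, not part of Hadoop: removes a stale output directory
// so that the job's output check does not fail on the next run.
final class OutputCleaner {
    private OutputCleaner() {}

    // Returns true if the directory existed and was deleted, false otherwise.
    static boolean deleteIfExists(Configuration conf, Path outputDir) throws IOException {
        // Resolve the file system that owns this path (HDFS, local, ...).
        FileSystem fs = outputDir.getFileSystem(conf);
        if (fs.exists(outputDir)) {
            // 'true' requests a recursive delete of the directory and its contents.
            return fs.delete(outputDir, true);
        }
        return false;
    }
}
```

In the Driver this would be called before the job is submitted, for example
OutputCleaner.deleteIfExists(job.getConfiguration(), new Path(args[1])), where
args[1] is assumed to hold the output path. The delete is recursive, so it is
worth double-checking that the path really is the output directory.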

Regards
Bejoy KS

On Tue, Apr 3, 2012 at 4:09 PM, Christoph Schmitz <
Christoph.Schmitz@1und1.de> wrote:

> Hi Xin,
>
> you can derive your own output format class from one of the Hadoop
> OutputFormats and make sure the "checkOutputSpecs" method, which usually
> does the checking, is empty:
>
> -----------
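> import java.io.IOException;
> import org.apache.hadoop.mapreduce.JobContext;
> import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
>
> public final class OverwritingTextOutputFormat<K, V>
>       extends TextOutputFormat<K, V> {
>    @Override
>    public void checkOutputSpecs(JobContext job) throws IOException {
>          // Intentionally empty: skips the check that fails when the
>          // output directory already exists.
>    }
> }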
> -----------
>
> Regards,
> Christoph
>
> -----Original Message-----
> From: Fang Xin [mailto:nusfangxin@gmail.com]
> Sent: Tuesday, 3 April 2012 11:35
> To: mapreduce-user
> Subject: how to overwrite output in HDFS?
>
> Hi, all
>
> I'm writing my own map-reduce code using Eclipse with the Hadoop plug-in.
> I've specified the input and output directories in the project properties
> (two folders, namely input and output).
>
> My problem is that each time I make a modification and try to
> run it again, I have to manually delete the previous output in HDFS;
> otherwise there will be an error.
> Can anyone kindly suggest how to simply overwrite the result?
>
> Best regards,
> Xin
>
