hadoop-mapreduce-user mailing list archives

From Fang Xin <nusfang...@gmail.com>
Subject Re: how to overwrite output in HDFS?
Date Tue, 03 Apr 2012 12:29:04 GMT
Hi Bejoy,

Could you kindly elaborate on this? What should I insert, and where?

Thank you!



On Tue, Apr 3, 2012 at 7:36 PM, Bejoy Ks <bejoy.hadoop@gmail.com> wrote:
> Hi Xin
>      The simplest way is to add code to your Driver class that checks
> whether the output dir exists in HDFS and, if it does, deletes it.
>
> Regards
> Bejoy KS
>
>
> On Tue, Apr 3, 2012 at 4:09 PM, Christoph Schmitz
> <Christoph.Schmitz@1und1.de> wrote:
>>
>> Hi Xin,
>>
>> you can derive your own output format class from one of the Hadoop
>> OutputFormats and make sure the "checkOutputSpecs" method, which usually
>> does the checking, is empty:
>>
>> -----------
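>> import java.io.IOException;
>> import org.apache.hadoop.mapreduce.JobContext;
>> import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
>>
>> public final class OverwritingTextOutputFormat<K, V>
>>       extends TextOutputFormat<K, V> {
>>    @Override
>>    public void checkOutputSpecs(JobContext job) throws IOException {
>>          // Intentionally empty: skips the check that rejects an
>>          // existing output directory.
>>    }
>> }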
>> -----------
>>
>> Regards,
>> Christoph
>>
>> -----Original Message-----
>> From: Fang Xin [mailto:nusfangxin@gmail.com]
>> Sent: Tuesday, 3 April 2012 11:35
>> To: mapreduce-user
>> Subject: how to overwrite output in HDFS?
>>
>> Hi, all
>>
>> I'm writing my own map-reduce code using Eclipse with the Hadoop plug-in.
>> I've specified input and output directories in the project properties
>> (two folders, namely input and output).
>>
>> My problem is that each time I make a modification and try to
>> run it again, I have to manually delete the previous output in HDFS,
>> otherwise I get an error.
>> Can anyone kindly suggest how to simply overwrite the result?
>>
>> Best regards,
>> Xin
>
>

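For reference, the driver-side check Bejoy describes (test whether the output dir exists, delete it if so) might look roughly like the sketch below. It is an illustration, not code posted in this thread: the class name OutputDirCleaner and the method deleteIfExists are made up for the example, and it assumes the standard org.apache.hadoop.fs.FileSystem API.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper (name invented for this example, not part of Hadoop).
public final class OutputDirCleaner {

    private OutputDirCleaner() {
        // Static utility class, not meant to be instantiated.
    }

    /**
     * Deletes outputDir (recursively) if it already exists.
     *
     * @return true if an existing directory was deleted, false if there was nothing to delete
     */
    public static boolean deleteIfExists(Configuration conf, Path outputDir) throws IOException {
        // Resolve the file system that owns this path; this is HDFS when the
        // job configuration points at an HDFS namenode.
        FileSystem fs = outputDir.getFileSystem(conf);
        if (fs.exists(outputDir)) {
            // The second argument requests a recursive delete.
            return fs.delete(outputDir, true);
        }
        return false;
    }
}
```

In the Driver class this would be called once, before the output path is set on the job and before the job is submitted, for example OutputDirCleaner.deleteIfExists(job.getConfiguration(), new Path(args[1])), assuming the output directory is passed as the second program argument. Note that the delete is recursive and unconditional, so any earlier results in that directory are lost.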