hadoop-mapreduce-user mailing list archives

From Christoph Schmitz <Christoph.Schm...@1und1.de>
Subject AW: how to overwrite output in HDFS?
Date Tue, 03 Apr 2012 10:39:55 GMT
Hi Xin,

You can derive your own output format class from one of the Hadoop OutputFormats and
override the "checkOutputSpecs" method, which normally checks that the output directory
does not already exist, with an empty body:

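import java.io.IOException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public final class OverwritingTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
        // Intentionally empty: skips the check that rejects an existing output directory.
    }
}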

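To illustrate why an empty "checkOutputSpecs" lets the job run again, here is a minimal sketch in plain Java. It does not use Hadoop at all: CheckingOutputFormat and NonCheckingOutputFormat are hypothetical stand-ins, and the existence check is only assumed to resemble what the Hadoop file output formats do by default.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a Hadoop output format; NOT a Hadoop class.
class CheckingOutputFormat {
    // Models the default behaviour: refuse to proceed if the output directory exists.
    public void checkOutputSpecs(Path outputDir) throws IOException {
        if (Files.exists(outputDir)) {
            throw new FileAlreadyExistsException(
                "Output directory " + outputDir + " already exists");
        }
    }
}

// Hypothetical subclass mirroring the idea of OverwritingTextOutputFormat.
class NonCheckingOutputFormat extends CheckingOutputFormat {
    @Override
    public void checkOutputSpecs(Path outputDir) throws IOException {
        // Intentionally empty: the existence check is skipped.
    }
}

class OverwriteCheckDemo {
    public static void main(String[] args) throws IOException {
        // A temporary directory plays the role of an existing output folder.
        Path existing = Files.createTempDirectory("output");
        try {
            new CheckingOutputFormat().checkOutputSpecs(existing);
        } catch (FileAlreadyExistsException e) {
            System.out.println("default check rejected: " + e.getMessage());
        }
        // The overriding class accepts the same directory without complaint.
        new NonCheckingOutputFormat().checkOutputSpecs(existing);
        System.out.println("overridden check accepted the existing directory");
        Files.delete(existing);
    }
}
```

In a real job, the class would be selected with job.setOutputFormatClass(OverwritingTextOutputFormat.class). Note that an empty check deletes nothing, so files left over from earlier runs may remain in the output directory.
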
-----Original Message-----
From: Fang Xin [mailto:nusfangxin@gmail.com]
Sent: Tuesday, 3 April 2012 11:35
To: mapreduce-user
Subject: how to overwrite output in HDFS?

Hi, all

I'm writing my own map-reduce code using Eclipse with the Hadoop plug-in.
I've specified input and output directories in the project properties
(two folders, namely input and output).

My problem is that each time I make a modification and try to
run it again, I have to manually delete the previous output in HDFS,
otherwise there will be an error.
Can anyone kindly suggest how to simply overwrite the result?

Best regards,