hadoop-common-dev mailing list archives

From "Alejandro Abdelnur" <tuc...@gmail.com>
Subject Re: Output filename generation?
Date Mon, 21 Apr 2008 06:12:39 GMT
The MultipleOutputFormat was designed to do what Goel is suggesting.

On Mon, Apr 21, 2008 at 10:41 AM, Amar Kamat <amarrk@yahoo-inc.com> wrote:
> pi song wrote:
>
> > Dear hadoop mailing list,
> >
> > Is there a way to control output filename generation? A sample use case is
> > when I want 2 MapReduce jobs to output to the same directory.
> >
> >
>  I think you need to write your own output format (see
> http://tinyurl.com/4aszgk). Look at OutputFormat.getRecordWriter(). The
> parameter *name* determines the output filename. One easy way would
> be to append the job name to this *name* in OutputFormat.getRecordWriter().
>  Something like:
>
>  public RecordWriter<WritableComparable, Writable> getRecordWriter(
>      FileSystem ignored, JobConf job, String name, Progressable progress)
>      throws IOException {
>    // Append the job name so that each job writes distinctly named files.
>    name = name + "_" + job.getJobName();
>    // rest of the code .. taken from Hadoop-0.16.3
>  }
>
>  Amar
>
> > Pi
> >
> >
> >
>
>
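
The naming idea in Amar's reply can be sketched in plain Java, without Hadoop on the classpath. This is only an illustration: the class OutputNameSketch and the helper outputName are hypothetical names, not part of any Hadoop API, and the step that replaces unusual characters in the job name is an added assumption (job names are free-form text, so they may contain spaces or slashes that are awkward in file names).

```java
public class OutputNameSketch {

    // Hypothetical helper: builds the per-job file name from the default
    // part name (e.g. "part-00000") and the job name, mirroring
    // name = name + "_" + job.getJobName() from the reply above.
    static String outputName(String partName, String jobName) {
        // Assumption: replace anything outside [A-Za-z0-9._-] with '_'
        // so the result is safe to use as a single path component.
        String safeJobName = jobName.replaceAll("[^A-Za-z0-9._-]", "_");
        return partName + "_" + safeJobName;
    }

    public static void main(String[] args) {
        // Two jobs writing the same partition into the same directory
        // end up with different file names.
        System.out.println(outputName("part-00000", "jobA"));
        System.out.println(outputName("part-00000", "jobB"));
        System.out.println(outputName("part-00000", "word count"));
    }
}
```

In a real output format the same concatenation would sit at the top of getRecordWriter(), before the record writer opens its file. Note that two jobs sharing a job name would still collide under this scheme.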
