hadoop-mapreduce-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: get name of file in mapper output directory
Date Mon, 23 May 2011 12:42:04 GMT
Hi Mark,

FYI, I'm moving the discussion over to
mapreduce-user@hadoop.apache.org since your question is specific to

You can derive the output name from the TaskAttemptID, which you can
get by calling getTaskAttemptID() on the context passed to your
cleanup() function. The task attempt ID will look like this:


You're interested in the m_000005 part. This gets translated into the
output file name part-m-00005.
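
As a rough sketch of that translation, the snippet below parses the
string form of a task attempt ID and builds the part file name from it.
The class name PartFileName, the method name, and the regular expression
are hypothetical helpers made up for illustration, and the ID layout
(attempt_<cluster id>_<job number>_<m|r>_<task number>_<attempt number>)
is assumed here rather than taken from this message. It also assumes the
default part-m-NNNNN naming described above.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: maps a task attempt ID string
// to the default part file name for that task.
final class PartFileName {
    // Assumed layout: attempt_<cluster id>_<job number>_<m|r>_<task number>_<attempt number>
    private static final Pattern ATTEMPT_ID =
        Pattern.compile("attempt_[^_]+_\\d+_([mr])_(\\d+)_\\d+");

    static String fromTaskAttemptId(String attemptId) {
        Matcher matcher = ATTEMPT_ID.matcher(attemptId);
        if (!matcher.matches()) {
            throw new IllegalArgumentException(
                "Unrecognized task attempt id: " + attemptId);
        }
        String taskType = matcher.group(1);                  // "m" for map, "r" for reduce
        int taskNumber = Integer.parseInt(matcher.group(2)); // e.g. 000005 -> 5
        // The ID pads the task number to six digits; the file name uses five.
        return String.format(Locale.ROOT, "part-%s-%05d", taskType, taskNumber);
    }
}
```

Inside cleanup() you would presumably pass in
context.getTaskAttemptID().toString() and resolve the returned name
against the job's output directory.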


On Sat, May 21, 2011 at 8:03 PM, Mark question <markq2011@gmail.com> wrote:
> Hi,
>  I'm running a job with maps only  and I want by end of each map
> (ie.Close() function) to open the file that the current map has wrote using
> its output.collector.
>  I know "job.getWorkingDirectory()"  would give me the parent path of the
> file written, but how to get the full path or the name (ie. part-00000 or
> part-00001).
> Thanks,
> Mark

Joseph Echeverria
Cloudera, Inc.
