hadoop-mapreduce-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: get name of file in mapper output directory
Date Mon, 23 May 2011 12:42:04 GMT
Hi Mark,

FYI, I'm moving the discussion over to
mapreduce-user@hadoop.apache.org since your question is specific to
MapReduce.

You can derive the output name from the TaskAttemptID, which you can
get by calling getTaskAttemptID() on the context passed to your
cleanup() function. The task attempt ID will look like this:

attempt_200707121733_0003_m_000005_0

You're interested in the m_000005 part, which is translated into the
output file name part-m-00005.
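
As a rough sketch of that translation, the snippet below works on the
string form of the attempt ID only, so it has no Hadoop dependency. The
class and method names (PartFileName, partFileName) are made up for
illustration, and it assumes the default part-m-NNNNN naming described
above; a job that configures a different output name would need to
adjust it.

```java
public class PartFileName {

    // Hypothetical helper: maps an attempt ID string such as
    // attempt_200707121733_0003_m_000005_0 to part-m-00005.
    static String partFileName(String attemptId) {
        String[] parts = attemptId.split("_");
        if (parts.length < 6 || !parts[0].equals("attempt")) {
            throw new IllegalArgumentException("Unexpected attempt ID: " + attemptId);
        }
        // Count from the end: ..._<type>_<task number>_<attempt number>
        String taskType = parts[parts.length - 3];                // "m" or "r"
        int taskNumber = Integer.parseInt(parts[parts.length - 2]);
        // Assumed default naming: task number zero-padded to five digits.
        return String.format("part-%s-%05d", taskType, taskNumber);
    }

    public static void main(String[] args) {
        System.out.println(partFileName("attempt_200707121733_0003_m_000005_0"));
    }
}
```

Inside cleanup() you would pass in context.getTaskAttemptID().toString()
and resolve the resulting name against the job's output directory.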

-Joey

On Sat, May 21, 2011 at 8:03 PM, Mark question <markq2011@gmail.com> wrote:
> Hi,
>
>  I'm running a job with maps only  and I want by end of each map
> (ie.Close() function) to open the file that the current map has wrote using
> its output.collector.
>
>  I know "job.getWorkingDirectory()"  would give me the parent path of the
> file written, but how to get the full path or the name (ie. part-00000 or
> part-00001).
>
> Thanks,
> Mark
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
