hadoop-common-dev mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3598) Map-Reduce framework needlessly creates temporary _${taskid} directories for Maps
Date Thu, 19 Jun 2008 16:07:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-3598:
----------------------------------

    Attachment: HADOOP-3598_0_20080619.patch

The attached patch pushes creation of the temporary output directory into FileOutputFormat,
thereby ensuring that Maps which do not produce output to HDFS do not create the temporary directories.
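
To illustrate the idea of deferring directory creation to the output format, here is a minimal sketch that uses only java.nio.file on a local temporary directory. It is not the patch itself and does not use the Hadoop API: LazyTaskOutput, stagingDir(), openWriter() and the task ids are hypothetical names chosen for the example, and only the _temporary/_${taskid} layout is taken from the issue.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for an output format; NOT the Hadoop FileOutputFormat API.
class LazyTaskOutput {
    private final Path stagingDir;

    LazyTaskOutput(Path outputDir, String taskId) {
        // Mirrors the ${mapred.out.dir}/_temporary/_${taskid} layout from the issue.
        // Nothing is created on disk here; the path is only computed.
        this.stagingDir = outputDir.resolve("_temporary").resolve("_" + taskId);
    }

    Path stagingDir() {
        return stagingDir;
    }

    // The staging directory is created only when a task actually asks for a writer.
    OutputStream openWriter(String fileName) throws IOException {
        Files.createDirectories(stagingDir); // no-op if the directory already exists
        return Files.newOutputStream(stagingDir.resolve(fileName));
    }
}

class LazyTaskOutputDemo {
    public static void main(String[] args) throws IOException {
        Path outputDir = Files.createTempDirectory("mapred-out");

        // A map that never writes a file never touches the namespace.
        LazyTaskOutput silentMap = new LazyTaskOutput(outputDir, "task_m_000000");

        // A map that writes a file gets its staging directory on first use.
        LazyTaskOutput writingMap = new LazyTaskOutput(outputDir, "task_m_000001");
        try (OutputStream out = writingMap.openWriter("part-00000")) {
            out.write("key\tvalue\n".getBytes(StandardCharsets.UTF_8));
        }

        System.out.println(Files.exists(silentMap.stagingDir()));  // false
        System.out.println(Files.exists(writingMap.stagingDir())); // true
    }
}
```

With this shape, the cost of a staging directory is paid only by tasks that open a writer, which is the behaviour the patch description is after.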

Is this an incompatible change? 
I think so: the framework no longer creates the directory, and once FileSystem.create doesn't
create the parent directory implicitly, all output formats will need to be aware of this.
Thoughts?
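
As a rough illustration of what "being aware of this" could mean for an output format, the sketch below uses java.nio.file as a stand-in for a create call that does not make missing parent directories. ExplicitParentCreate, createStrict() and createWithParents() are hypothetical names for this example and are not part of Hadoop's FileSystem API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; java.nio.file stands in for a file system whose
// create call does NOT create missing parent directories.
class ExplicitParentCreate {

    // Fails with an IOException when the parent directory is missing.
    static OutputStream createStrict(Path file) throws IOException {
        return Files.newOutputStream(file);
    }

    // What an output format would have to do itself: make the parent
    // directory first, then create the file.
    static OutputStream createWithParents(Path file) throws IOException {
        Files.createDirectories(file.getParent());
        return createStrict(file);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("explicit-parent");
        Path part = root.resolve("_temporary").resolve("_task_m_000000").resolve("part-00000");

        try (OutputStream out = createStrict(part)) {
            System.out.println("created without parent (unexpected)");
        } catch (IOException e) {
            System.out.println("strict create failed: parent directory is missing");
        }

        try (OutputStream out = createWithParents(part)) {
            out.write('x');
        }
        System.out.println(Files.exists(part));
    }
}
```

An output format that relied on the framework (or on an implicit parent-creating create) would hit the first branch, which is why the change looks incompatible.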

Once we reach consensus, I'll attach another patch that also includes the relevant documentation fix.


> Map-Reduce framework needlessly creates temporary _${taskid} directories for Maps
> ---------------------------------------------------------------------------------
>
>                 Key: HADOOP-3598
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3598
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.18.0
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>            Priority: Blocker
>             Fix For: 0.18.0
>
>         Attachments: HADOOP-3598_0_20080619.patch
>
>
> The staging directory for task outputs (i.e. ${mapred.out.dir}/_temporary/_${taskid})
> should only be created when Maps produce output on HDFS, which usually isn't the case. This
> plays very badly with HDFS quotas and may lead to thousands of temporary names in the FS namespace,
> thereby overrunning the quotas. In any case, it isn't good to needlessly create these directories.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

