hadoop-mapreduce-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6984) MR AM to clean up temporary files from previous attempt
Date Wed, 17 Jan 2018 15:50:00 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328901#comment-16328901 ]

Haibo Chen commented on MAPREDUCE-6984:

In TestRecovery, we always create an MRApp instance and stop it to simulate an AM failure,
which in my opinion is more readable than mucking with the file system directly in our test.
Also, the current test is coupled to the behavior of FileOutputCommitter. We can add a similar
test in TestRecovery, wrap the OutputCommitter in a spy, and verify that abortJob is called.
This way, we also cover other OutputCommitters.
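
A minimal sketch of the spy idea, in plain Java so it runs without Hadoop or Mockito on the
classpath. Everything in it is hypothetical: Committer, SpyCommitter and AppMasterAttempt are
pared-down stand-ins, not the Hadoop OutputCommitter or MRApp classes, and the rule "call
abortJob on a later attempt when recovery is off" is an assumption about what the patch does.
In TestRecovery the hand-rolled spy would presumably be replaced by a Mockito spy around the
real committer.

```java
import java.util.ArrayList;
import java.util.List;

class AbortJobSpySketch {

    /** Hypothetical, pared-down stand-in for an output committer. */
    interface Committer {
        void setupJob(String jobId);
        void abortJob(String jobId);
    }

    /** A committer that does nothing; any implementation could be used instead. */
    static final class NoOpCommitter implements Committer {
        @Override public void setupJob(String jobId) { }
        @Override public void abortJob(String jobId) { }
    }

    /** Hand-rolled spy: records every call, then delegates to the wrapped committer. */
    static final class SpyCommitter implements Committer {
        private final Committer delegate;
        final List<String> calls = new ArrayList<>();

        SpyCommitter(Committer delegate) {
            this.delegate = delegate;
        }

        @Override public void setupJob(String jobId) {
            calls.add("setupJob:" + jobId);
            delegate.setupJob(jobId);
        }

        @Override public void abortJob(String jobId) {
            calls.add("abortJob:" + jobId);
            delegate.abortJob(jobId);
        }
    }

    /** Hypothetical stand-in for one AM attempt. */
    static final class AppMasterAttempt {
        private final Committer committer;
        private final int attemptNumber;      // assumed to start at 1 in this sketch
        private final boolean recoveryEnabled;

        AppMasterAttempt(Committer committer, int attemptNumber, boolean recoveryEnabled) {
            this.committer = committer;
            this.attemptNumber = attemptNumber;
            this.recoveryEnabled = recoveryEnabled;
        }

        void start(String jobId) {
            // Assumed behavior under test: a restart without recovery aborts the
            // previous attempt's output before setting the job up again.
            if (attemptNumber > 1 && !recoveryEnabled) {
                committer.abortJob(jobId);
            }
            committer.setupJob(jobId);
        }
    }

    public static void main(String[] args) {
        SpyCommitter spy = new SpyCommitter(new NoOpCommitter());
        // First attempt starts and is then considered failed.
        new AppMasterAttempt(spy, 1, false).start("job_1");
        // Second attempt starts without recovery.
        new AppMasterAttempt(spy, 2, false).start("job_1");
        System.out.println("abortJob called: " + spy.calls.contains("abortJob:job_1"));
    }
}
```

Because the check is made against the recorded calls and not against files on disk, the same
test shape works for any committer implementation that is wrapped by the spy.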

> MR AM to clean up temporary files from previous attempt
> -------------------------------------------------------
>                 Key: MAPREDUCE-6984
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6984
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>    Affects Versions: 3.0.0-beta1
>            Reporter: Gergo Repas
>            Assignee: Gergo Repas
>            Priority: Major
>         Attachments: MAPREDUCE-6984.000.patch, MAPREDUCE-6984.001.patch, MAPREDUCE-6984.003.patch,
MAPREDUCE-6984.004.patch, MAPREDUCE-6984.005.patch, MAPREDUCE-6984.006.patch
> When the MR AM restarts, the {outputDir}/_temporary/{appAttemptNumber}
directory remains on HDFS, even though this directory is not used during the next attempt
if the restart has been done without recovery. So if recovery is not used for the AM restart,
then the deletion of this directory can be done earlier (at the start of the next attempt).
The benefit is that more free HDFS space is available for the next attempt.
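
The cleanup described above can be sketched as follows. This is only an illustration under
stated assumptions: it uses java.nio.file on a local directory as a stand-in for HDFS, the
class and method names are hypothetical, and attempts are assumed to be numbered from 1. It
is not the code from the attached patches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PreviousAttemptCleanupSketch {

    /**
     * Deletes {outputDir}/_temporary/{currentAttempt - 1} when the restart is done
     * without recovery. Returns true only if a directory was actually removed.
     */
    static boolean cleanUpPreviousAttempt(Path outputDir, int currentAttempt,
                                          boolean recoveryEnabled) {
        if (recoveryEnabled || currentAttempt <= 1) {
            return false; // recovery may still need the files, or there is no previous attempt
        }
        Path previous = outputDir.resolve("_temporary")
                                 .resolve(String.valueOf(currentAttempt - 1));
        if (!Files.exists(previous)) {
            return false;
        }
        try (Stream<Path> walk = Files.walk(previous)) {
            // Reverse order puts children before their parent directories.
            List<Path> paths = walk.sorted(Comparator.reverseOrder())
                                   .collect(Collectors.toList());
            for (Path p : paths) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    /** Creates a throwaway output directory with one file under _temporary/{attempt}. */
    static Path createSampleOutputDir(int attempt) {
        try {
            Path outputDir = Files.createTempDirectory("mr-output");
            Path attemptDir = outputDir.resolve("_temporary").resolve(String.valueOf(attempt));
            Files.createDirectories(attemptDir);
            Files.write(attemptDir.resolve("part-00000"), new byte[] {1, 2, 3});
            return outputDir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path outputDir = createSampleOutputDir(1);
        boolean removed = cleanUpPreviousAttempt(outputDir, 2, false);
        System.out.println("previous attempt directory removed: " + removed);
    }
}
```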

This message was sent by Atlassian JIRA

