hadoop-mapreduce-issues mailing list archives

From "zhihai xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5968) Work directory is not deleted in DistCache if Exception happen in downloadCacheObject.
Date Mon, 04 Aug 2014 23:39:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14085503#comment-14085503 ]

zhihai xu commented on MAPREDUCE-5968:
--------------------------------------

Thanks for the comments. These are good findings.
1. Based on your suggestion, I can simplify the code: remove delWorkDir and instead check
whether the work dir exists before deleting it in the finally block.
2. Define "-work-" as a constant in the class, so it can be reused by both production and
test code.
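
A minimal sketch of the two changes above, not the actual patch. It uses plain java.nio.file instead of the Hadoop FileSystem API so that it runs on its own, and the names WorkDirCleanupSketch, WORK_DIR_FIX, Copier, download and deleteRecursively are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WorkDirCleanupSketch {
    // Hypothetical constant name; shared so production and test code agree on the marker.
    static final String WORK_DIR_FIX = "-work-";

    // Stand-in for the real copy step (assumption: it may throw IOException, e.g. disk full).
    interface Copier {
        void copyInto(Path workDir) throws IOException;
    }

    static Path download(Path finalDir, long uniqueId, Copier copier) throws IOException {
        Path workDir = finalDir.resolveSibling(
                finalDir.getFileName() + WORK_DIR_FIX + uniqueId);
        Files.createDirectories(workDir);
        try {
            copier.copyInto(workDir);      // copy into the temporary work directory
            Files.move(workDir, finalDir); // rename to the final directory
            return finalDir;
        } finally {
            // No delWorkDir flag: after a successful rename the work directory is gone,
            // so an existence check is enough to decide whether cleanup is needed.
            if (Files.exists(workDir)) {
                deleteRecursively(workDir);
            }
        }
    }

    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }
}
```

One caveat with this shape: if the cleanup in the finally block itself throws, that exception replaces the original one from the copy, so real code may want to log and swallow cleanup failures instead.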



> Work directory is not deleted in  DistCache if Exception happen in downloadCacheObject.
> ---------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5968
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5968
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 1.2.1
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: MAPREDUCE-5968.branch1.patch
>
>
> Work directory is not deleted in DistCache if an exception happens in downloadCacheObject.
> In downloadCacheObject, the cache file is first copied to a temporary work directory, and
> then the work directory is renamed to the final directory. If an IOException happens during
> the copy, the work directory is not deleted, which leaves garbage data in the local
> disk cache. For example, if an MR application uses the Distributed Cache to send a very large
> archive or file (50G) and the disk fills up during the copy, an IOException is triggered and
> the work directory is neither deleted nor renamed, so it occupies a large chunk of disk space.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
