hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5267) History server should be more robust when cleaning old jobs
Date Mon, 03 Jun 2013 15:13:21 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13673195#comment-13673195 ]

Jason Lowe commented on MAPREDUCE-5267:
---------------------------------------

The proposed listStatus change would help, but it would not be sufficient to prevent problems
from cropping up.  For example, if someone accidentally creates a directory under /mapred/done
that the history server can read but cannot delete files from, cleaning would still run into
problems.  The bottom line is that HistoryFileManager.clean needs to protect itself from
IOExceptions that can occur when interacting with the filesystem, and it should try to make
cleanup progress despite those exceptions.

The listStatus inconsistency is best handled by a separate JIRA.
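
To illustrate the kind of localized handling meant above, here is a minimal sketch. It is not
the HistoryFileManager code or a proposed patch: it uses plain java.nio.file instead of the
Hadoop filesystem API, and the names RobustCleaner, clean, and Result are hypothetical. The
point is only that each directory listing and each delete gets its own try/catch, so one bad
entry is recorded and skipped rather than aborting the whole pass.

```java
import java.io.IOException;
import java.nio.file.DirectoryIteratorException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the actual history server implementation.
public class RobustCleaner {

    /** Outcome of one cleaning pass: what was removed and what had to be skipped. */
    public static final class Result {
        public final List<Path> deleted = new ArrayList<>();
        public final List<Path> failed = new ArrayList<>();
    }

    /**
     * Deletes regular files last modified before cutoffMillis in every
     * subdirectory of doneRoot. An IOException on one directory or file is
     * recorded in Result.failed and the loop moves on to the next entry.
     */
    public static Result clean(Path doneRoot, long cutoffMillis) {
        Result result = new Result();
        List<Path> dateDirs;
        try {
            dateDirs = list(doneRoot);
        } catch (IOException e) {
            // Nothing can be scanned if the root itself is unreadable.
            result.failed.add(doneRoot);
            return result;
        }
        for (Path dateDir : dateDirs) {
            List<Path> entries;
            try {
                entries = list(dateDir);
            } catch (IOException e) {
                // Unlistable entry: skip it, keep cleaning the others.
                result.failed.add(dateDir);
                continue;
            }
            for (Path entry : entries) {
                try {
                    if (Files.isRegularFile(entry)
                            && Files.getLastModifiedTime(entry).toMillis() < cutoffMillis) {
                        Files.delete(entry);
                        result.deleted.add(entry);
                    }
                } catch (IOException e) {
                    // Undeletable file: record it and continue with the next one.
                    result.failed.add(entry);
                }
            }
        }
        return result;
    }

    /** Lists a directory eagerly so that listing errors surface as IOException. */
    private static List<Path> list(Path dir) throws IOException {
        List<Path> out = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                out.add(p);
            }
        } catch (DirectoryIteratorException e) {
            // Iteration failures are wrapped in an unchecked exception; unwrap them.
            throw e.getCause();
        }
        return out;
    }
}
```

A caller could log the entries in Result.failed and simply let the next scheduled pass retry
them, which keeps one misconfigured directory from blocking cleanup of everything else.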
                
> History server should be more robust when cleaning old jobs
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5267
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5267
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>    Affects Versions: 0.23.7, 2.0.4-alpha
>            Reporter: Jason Lowe
>
> Ran across a situation where an admin user had accidentally created a directory in one
> of the date directories under /mapred/history/done/ that was not readable by the historyserver
> user.  That effectively prevented the history server from cleaning any jobs from that date
> forward, as it hit an IOException trying to scan the directory and that aborted the entire
> clean process.
> The history server should localize IOException handling to the directory/file being processed
> and move on to the next entry in the list rather than aborting the entire cleaning process.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
