hadoop-mapreduce-issues mailing list archives

From "Maysam Yabandeh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5267) History server should be more robust when cleaning old jobs
Date Sat, 01 Jun 2013 17:57:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672184#comment-13672184 ]

Maysam Yabandeh commented on MAPREDUCE-5267:
--------------------------------------------

One possible fix is to change FileContext.Util#listStatus(Path) to skip the files and directories
that it cannot access.
{code:java}
public FileStatus[] next(final AbstractFileSystem fs, final Path p) 
  throws IOException, UnresolvedLinkException {
  return fs.listStatus(p);
}
{code}
This would be consistent with the behavior of the local file system in RawLocalFileSystem#listStatus
{code:java}
File[] names = localf.listFiles();
{code}
which returns only accessible items.
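
As a rough illustration of the "skip what cannot be accessed" idea, here is a minimal sketch that uses only the JDK's java.nio.file API rather than Hadoop's FileContext. The class name SkippingLister and its fields are hypothetical and exist only for this example; the point is that an entry whose attributes or contents cannot be read is recorded and skipped instead of aborting the whole listing.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: walks a tree and skips unreadable entries.
class SkippingLister {
    /** Non-directory entries that were visited successfully. */
    final List<Path> files = new ArrayList<>();
    /** Entries that could not be read or opened and were therefore skipped. */
    final List<Path> skipped = new ArrayList<>();

    static SkippingLister list(Path root) throws IOException {
        SkippingLister result = new SkippingLister();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                result.files.add(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // The default implementation rethrows exc and aborts the walk.
                // Here the entry is recorded and the walk continues.
                result.skipped.add(file);
                return FileVisitResult.CONTINUE;
            }
        });
        return result;
    }
}
```

Keeping a list of skipped entries, rather than dropping them silently, leaves the caller free to log them so that an administrator can still notice the permission problem.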

Also, is there already a standard way of testing HistoryFileManager on
top of HDFS? Currently, the tests in TestJobHistoryParsing.java run on top of the local
file system and hence do not reveal the kind of bug reported in this JIRA. I made a first
attempt at using MiniDFSCluster and setting its URI in the remoteFS variable in conf, but it does
not seem to be picked up by HistoryFileManager.
                
> History server should be more robust when cleaning old jobs
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-5267
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5267
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>    Affects Versions: 0.23.7, 2.0.4-alpha
>            Reporter: Jason Lowe
>
> Ran across a situation where an admin user had accidentally created a directory in one
> of the date directories under /mapred/history/done/ that was not readable by the historyserver
> user.  That effectively prevented the history server from cleaning any jobs from that date
> forward, as it hit an IOException trying to scan the directory and that aborted the entire
> clean process.
> The history server should localize IOException handling to the directory/file being processed
> and move on to the next entry in the list rather than aborting the entire cleaning process.
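
The behavior requested in the description above can be sketched roughly as follows. This is only an illustration of localizing the IOException to one entry, not the history server's actual cleaning code: TolerantCleaner, DirCleaner, and cleanAll are hypothetical names, and directories are represented as plain strings to keep the example free of Hadoop dependencies.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a failure on one directory does not stop the others from being cleaned.
class TolerantCleaner {
    /** Hypothetical per-directory cleanup action that may fail for a single directory. */
    interface DirCleaner {
        void clean(String dir) throws IOException;
    }

    /** Cleans each directory in turn and returns the directories that failed. */
    static List<String> cleanAll(List<String> dirs, DirCleaner cleaner) {
        List<String> failed = new ArrayList<>();
        for (String dir : dirs) {
            try {
                cleaner.clean(dir);
            } catch (IOException e) {
                // Handle the failure for this entry only, then move on to the next one.
                failed.add(dir);
            }
        }
        return failed;
    }
}
```

Returning the failed directories lets the caller report them in one place once the pass has finished.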

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
