hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5267) History server should be more robust when cleaning old jobs
Date Mon, 14 Apr 2014 14:22:16 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-5267:

    Target Version/s: 2.5.0  (was: 3.0.0, 0.23.11)

bq. Is this issue still a problem?

Yes, it's still an unresolved problem.  An inaccessible directory placed under history/done/<yyyy>/<mm>/<dd>/
will still cause the history server to abort the cleanup process early, leaving history files
to accumulate without bound.  Agree this is unlikely to be fixed in a 0.23 release, so I targeted
it to 2.5.0 for now.

Took a quick look at the patch, and it looks mostly OK.  The FileContext change is a non-starter
for me though, as any patch to fix this should not involve changing the Hadoop filesystem

> History server should be more robust when cleaning old jobs
> -----------------------------------------------------------
>                 Key: MAPREDUCE-5267
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5267
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobhistoryserver
>    Affects Versions: 0.23.7, 2.0.4-alpha
>            Reporter: Jason Lowe
>            Assignee: Maysam Yabandeh
>         Attachments: MAPREDUCE-5267.patch, MAPREDUCE-5267.patch
>
> Ran across a situation where an admin user had accidentally created a directory in one
> of the date directories under /mapred/history/done/ that was not readable by the historyserver
> user.  That effectively prevented the history server from cleaning any jobs from that date
> forward, as it hit an IOException trying to scan the directory and that aborted the entire
> clean process.
>
> The history server should localize IOException handling to the directory/file being processed
> and move on to the next entry in the list rather than aborting the entire cleaning process.
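
The handling proposed in the description above can be illustrated with a minimal Java sketch. This is not the attached patch: the class name HistoryCleanerSketch, the EntryCleaner interface, and the cleanAll method are all hypothetical, and the sketch uses java.nio.file.Path instead of the Hadoop filesystem API so that it runs without Hadoop on the classpath. It only shows the idea of catching the IOException per entry and continuing with the next one.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not the history server's actual code.
final class HistoryCleanerSketch {

    /** Hypothetical per-entry action, e.g. scanning a date directory and deleting old files. */
    interface EntryCleaner {
        void clean(Path entry) throws IOException;
    }

    /**
     * Applies the cleaner to every entry. An IOException on one entry is
     * logged and recorded, and the loop moves on to the next entry instead
     * of aborting the whole pass. Returns the entries that failed.
     */
    static List<Path> cleanAll(List<Path> entries, EntryCleaner cleaner) {
        List<Path> failed = new ArrayList<>();
        for (Path entry : entries) {
            try {
                cleaner.clean(entry);
            } catch (IOException e) {
                // Failure is localized to this entry only.
                failed.add(entry);
                System.err.println("Skipping " + entry + ": " + e.getMessage());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Made-up paths for illustration; nothing is read from or deleted on disk.
        List<Path> entries = Arrays.asList(
                Paths.get("done/2014/04/12"),
                Paths.get("done/2014/04/13/unreadable"),
                Paths.get("done/2014/04/14"));

        List<Path> failed = cleanAll(entries, entry -> {
            // Simulate the permission problem from the report.
            if (entry.toString().contains("unreadable")) {
                throw new IOException("permission denied");
            }
            System.out.println("Cleaned " + entry);
        });

        System.out.println("Entries skipped: " + failed);
    }
}
```

Returning the failed entries, rather than only logging them, is a design choice of this sketch: it lets a caller report or retry the skipped directories on a later pass.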
