hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1252) Disk problems should be handled better by the MR framework
Date Thu, 03 May 2007 20:08:15 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493496 ]

Doug Cutting commented on HADOOP-1252:

One more red flag in this patch for me is that the accessor method for the configuration
parameter "mapred.local.dir" is not in the mapred package.  If this is mapred-specific, then
it belongs in the mapred package, no?  So if we think that LocalDirAllocator might be useful
to non-mapred applications, then it should go in the fs package, but the code that creates
a LocalDirAllocator based on the value of "mapred.local.dir" surely belongs in the mapred
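
The split suggested above could look roughly like the following minimal sketch. It is not
the code from the attached patch: Conf, DirAllocatorSketch and MapredLocalDirs are
hypothetical stand-ins invented for illustration, the round-robin choice of directory is an
assumption, and only the key "mapred.local.dir" comes from the comment. The intended
package of each class is given in comments, since a single file cannot span packages.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a configuration object (not Hadoop's own class).
final class Conf {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
        values.put(key, value);
    }

    // Returns the comma-separated value stored under key, or an empty array.
    String[] getStrings(String key) {
        String v = values.get(key);
        return (v == null || v.isEmpty()) ? new String[0] : v.split(",");
    }
}

// Generic piece; would live in the fs package. It knows nothing about mapred:
// the caller supplies the configuration key that lists the directories.
final class DirAllocatorSketch {
    private final String confKey;
    private int next = 0;

    DirAllocatorSketch(String confKey) {
        this.confKey = confKey;
    }

    String getConfKey() {
        return confKey;
    }

    // Picks the next configured directory in round-robin order (an assumption
    // made for this sketch, not a statement about the patch).
    String nextDir(Conf conf) {
        String[] dirs = conf.getStrings(confKey);
        if (dirs.length == 0) {
            throw new IllegalStateException("No directories configured under " + confKey);
        }
        String dir = dirs[next % dirs.length].trim();
        next = (next + 1) % dirs.length;
        return dir;
    }
}

// Mapred-specific piece; would live in the mapred package. It is the only
// place that mentions the "mapred.local.dir" key.
final class MapredLocalDirs {
    static final String LOCAL_DIR_KEY = "mapred.local.dir";

    private MapredLocalDirs() {
    }

    static DirAllocatorSketch newAllocator() {
        return new DirAllocatorSketch(LOCAL_DIR_KEY);
    }
}
```

With this layout a non-mapred application could construct a DirAllocatorSketch with its own
key, while mapred code would go through MapredLocalDirs.newAllocator().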

> Disk problems should be handled better by the MR framework
> ----------------------------------------------------------
>                 Key: HADOOP-1252
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1252
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Devaraj Das
>         Assigned To: Devaraj Das
>             Fix For: 0.13.0
>         Attachments: 1252.new.patch, 1252.patch, 1252.patch
> The MR framework should recover from disk failure problems without causing jobs to hang.
> Note that this issue is about a short-term solution to the problem, for example,
> looking at the code and improving the exception handling (to better detect faulty disks and
> missing files). The long-term approach might be to have an FS layer that takes care of failed
> disks and makes it transparent to the tasks. That will be a separate issue by itself.
> Some of the issues that have been reported are HADOOP-1087 and a comment by Koji on HADOOP-1200
> (not sure whether those are all). Please add to this issue as much detail as possible on
> disk failures leading to hung jobs (details like relevant exception traces, way to reproduce,

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
