hadoop-mapreduce-issues mailing list archives

From "Ravi Gummadi (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2850) TaskTracker disk failure handling (MR-2413) has no test coverage
Date Wed, 12 Oct 2011 12:03:12 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125749#comment-13125749 ]

Ravi Gummadi commented on MAPREDUCE-2850:
-----------------------------------------

>> In prepareDirToFail it says the file is set to perms "000" but File#createNewFile
uses the default perms (eg 644 with umask 022), so it should still be accessible right?

No, the comment was wrong. I replace the directory with a file of the same name so that
DiskChecker.checkDirs() fails when it tries to mkdir that path; this is then reported as a
disk failure. Updating the comment accordingly.
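
The idea can be sketched with plain java.io/java.nio calls, without the Hadoop classes.
This is only an illustration: the class and method names below are hypothetical, and
looksHealthy() is an assumed stand-in for a directory check, not DiskChecker's actual code.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class DirToFailSketch {

    // Replace an empty directory with a regular file of the same name.
    static void replaceDirWithFile(File dir) throws IOException {
        Files.delete(dir.toPath());   // throws if the directory is not empty
        if (!dir.createNewFile()) {
            throw new IOException("could not create file " + dir);
        }
    }

    // Assumed stand-in for a health check: the path must be a directory,
    // or it must be possible to create one there.
    static boolean looksHealthy(File dir) {
        return dir.isDirectory() || dir.mkdirs();
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("local-dir").toFile();
        System.out.println(looksHealthy(dir));   // true: a real directory
        replaceDirWithFile(dir);
        System.out.println(looksHealthy(dir));   // false: mkdirs() fails on an existing file
        dir.delete();
    }
}
```

Because File.mkdirs() returns false when the path already exists as a regular file, the
check fails regardless of the file's permissions, which is why the "000" perms in the old
comment were beside the point.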

>> If you don't want waitForDiskHealthCheck to always wait 10s at a time, it seems like
you can lower the DISK_HEALTH_CHECK_INTERVAL to eg 1s.

OK. Changing to 1 sec.
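
To illustrate why a shorter interval helps, here is a generic polling wait. It is a sketch
under assumptions, not the patch's waitForDiskHealthCheck: the class name, waitFor(), and
its parameters are all hypothetical. A test that polls this way returns within roughly one
interval of the condition becoming true, so a 1s interval bounds the extra wait far more
tightly than 10s.

```java
import java.util.function.BooleanSupplier;

class PollingWaitSketch {

    // Poll the condition every intervalMs until it holds or timeoutMs elapses.
    // Returns true if the condition was observed to hold, false on timeout.
    static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(intervalMs);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] checks = {0};
        // Hypothetical condition: becomes true on the third check.
        boolean ok = waitFor(() -> ++checks[0] >= 3, 10, 1000);
        System.out.println(ok + " after " + checks[0] + " checks");
    }
}
```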

>> Would also be good to test startup with a failed directory. Feel free to punt this
to MAPREDUCE-2921.

This change would need a handle into the code of MiniMRCluster.TaskTrackerRunner(), which
requires some refactoring and exposing some API in MiniMRCluster. So I am not doing it as
part of the current JIRA.
                
> TaskTracker disk failure handling (MR-2413) has no test coverage
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-2850
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2850
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: tasktracker
>    Affects Versions: 0.20.204.0
>            Reporter: Eli Collins
>            Assignee: Ravi Gummadi
>         Attachments: MR2850.v0.patch
>
>
> MR-2413 doesn't have any test coverage that eg tests that the TT can survive disk failure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
