hadoop-mapreduce-issues mailing list archives

From "Vinod Kumar Vavilapalli (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3121) NodeManager should handle disk-failures
Date Thu, 17 Nov 2011 09:28:52 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151940#comment-13151940 ]

Vinod Kumar Vavilapalli commented on MAPREDUCE-3121:

Okay, this is still a monster patch, but it's in far better shape. I anticipate just one more iteration.

 - Please don't include util-like APIs in the NodeHealthStatus record; we want to keep the record implementations to the bare essentials.
 - Pass DiskHandlerService everywhere instead of NodeHealthCheckerService.
 - Change DISKS_FAILED to not use 144. Maybe -1001.
 - Remove unused imports in ContainersLauncher.
 - Remove the commented out init() code in ContainerExecutor.
 - Rename LocalStorage to DirectoryCollection?
 - getHealthScriptTimer() belongs to the HealthScriptRunner itself. Let's make nodeHealthScriptRunner.getTimerTask()
public and drop TimerTask getHealthScriptTimer() from NodeHealthCheckerService.
 - Trivial: (java)doc NodeHealthCheckerService class.

ContainerLaunch: When all disks have failed, use the health-report in the exception *and* also add diagnostics to the event.
 - Same in ResourceLocalizationService
 - DiskHandlerService: when major-percentage disks are gone, log the report. (+108)
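For concreteness, the shape I have in mind is something like the sketch below: build the event and the exception message from the same health report, so the user actually sees why the container died. ContainerExitEvent and all names here are stand-ins, not the patch's real event types:

```java
// Sketch only: a stand-in event that carries the diagnostics string built
// from the node health report. The real patch would use its own event types.
public class ContainerExitEvent {
  private final String containerId;
  private final int exitCode;
  private final String diagnostics;

  public ContainerExitEvent(String containerId, int exitCode, String diagnostics) {
    this.containerId = containerId;
    this.exitCode = exitCode;
    this.diagnostics = diagnostics;
  }

  public String getContainerId() { return containerId; }
  public int getExitCode() { return exitCode; }
  public String getDiagnostics() { return diagnostics; }

  /** Build the event from the same health report used in the exception. */
  public static ContainerExitEvent disksFailed(String containerId, String healthReport) {
    String msg = "All local dirs failed: " + healthReport;
    // -1001 is the proposed DISKS_FAILED exit code from the list above.
    return new ContainerExitEvent(containerId, -1001, msg);
  }
}
```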

 - Take a snapshot of dirs before the health-check for startLocalizer()?
 - PublicLocalizer uses a LocalDirAllocator for downloading files. Should it instead use DiskHandlerService?
Maybe also check that a minimum percentage of disks is alive for each addResource() request. You
will need changes to FSDownload too.
 - Remove PUBCACHE_CTXT after doing above.
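The min-percentage check I'm suggesting is just something along these lines; the class and method names are illustrative, not an API in the patch:

```java
// Sketch: refuse new localization work when fewer than minGoodDisksPct
// percent of the configured local dirs are healthy. Names are illustrative.
public class DiskHealthGuard {
  /** True when at least minGoodDisksPct percent of the local dirs are healthy. */
  public static boolean enoughGoodDirs(int goodDirs, int totalDirs, float minGoodDisksPct) {
    if (totalDirs == 0) {
      return false;                       // no dirs configured: nothing is usable
    }
    return ((float) goodDirs / totalDirs) * 100f >= minGoodDisksPct;
  }
}
```

Each addResource() request would consult this before handing work to FSDownload.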

 - Existing log message at +120 can also list the good dirs. Bad dirs can be deduced from
the DHS logs.

 - The APIs that take a size are either not needed or don't need the size parameter itself.
 - Take a lock on the cloned config for accesses via updateDirsInConfiguration(), getLocalPathForWrite(String
pathStr) etc. where configuration is accessed.
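To be clear about the locking I mean: every read and write of the cloned configuration should go through one lock, so updateDirsInConfiguration() can never expose a half-updated dir list to getLocalPathForWrite(). A plain List stands in for Hadoop's Configuration below; all names are illustrative:

```java
// Sketch only: one lock guards both the dir-list update and every read.
// A List stands in for the cloned Configuration object.
import java.util.ArrayList;
import java.util.List;

public class LocalDirsConf {
  private final Object confLock = new Object();
  private List<String> localDirs = new ArrayList<>();

  public void updateDirsInConfiguration(List<String> goodDirs) {
    synchronized (confLock) {
      localDirs = new ArrayList<>(goodDirs);   // swap in the new good-dir list
    }
  }

  public String getLocalPathForWrite(String pathStr) {
    synchronized (confLock) {
      if (localDirs.isEmpty()) {
        throw new IllegalStateException("no good local dirs");
      }
      // Real code would do round-robin / free-space selection; first dir here.
      return localDirs.get(0) + "/" + pathStr;
    }
  }
}
```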

 - Change the defaults for numLocalDirs and numLogDirs to 4? Also, consolidate the constructors?
I can see the N-constructors pattern of MiniMRCluster emerging; let's avoid that.

 - Update to not have the removed configs.
 - Can you also add banned.users and min.user.id with the default values?

 - verifyDisksHealth(): Loop through and wait for a maximum of, say, 10 seconds for the node to turn unhealthy.
 - waitForDiskHealthCheck(): We can capture DiskHandlerService's last report time and wait
till it changes at least once. Of course, that should be capped by an upper limit on the wait.
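Something like the following is what I'm picturing for that bounded wait; the LongSupplier stands in for a getLastDisksCheckTime()-style accessor on DiskHandlerService (a name I'm making up for the sketch):

```java
// Sketch: poll until the disk checker's last-report timestamp changes at
// least once, capped at maxWaitMs so a stuck checker can't hang the test.
import java.util.function.LongSupplier;

public class DiskHealthWait {
  /** Returns true if a new health report was observed before the cap expired. */
  public static boolean waitForDiskHealthCheck(LongSupplier lastReportTime, long maxWaitMs) {
    long seen = lastReportTime.getAsLong();        // snapshot before waiting
    long deadline = System.currentTimeMillis() + maxWaitMs;
    while (System.currentTimeMillis() < deadline) {
      if (lastReportTime.getAsLong() != seen) {
        return true;                               // checker ran at least once
      }
      try {
        Thread.sleep(50);                          // poll interval
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return false;                                  // capped by the upper limit
  }
}
```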

Can you run the linux-container-executor tests: TestLinuxContainerExecutor and TestContainerManagerWithLCE?
Create a separate ticket for handling the disks that come back up online.
Create a separate ticket for having a metric for numFailedDirs.


Test plan:
 - RM stops scheduling when major-percentage of disks go bad: Done
 - Node's DiskHandler recognises bad disks: Done
 - Node's DiskHandler recognises minimum percentage of good disks: Done
 - Integration test: Run a mapreduce job (so that Shuffle is also verified), offline some
disks, run one more job and verify that both the apps pass. TODO
 - LogAggregation test: Verify that logs written on bad disks are ignored for aggregation
(augment TestLogAggregationService): TODO
 - ContainerLaunch: Verify that
   -- new containers don't use bad directories (by testing the LOCAL_DIRS env in a custom map
job): TODO
   -- if major percentage disks turn bad,
      -- container should exit with proper exit code (should be easy with a custom application).
      -- localization for a resource fails TODO 
> NodeManager should handle disk-failures
> ---------------------------------------
>                 Key: MAPREDUCE-3121
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3121
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, nodemanager
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Ravi Gummadi
>            Priority: Blocker
>             Fix For: 0.23.1
>         Attachments: 3121.patch, 3121.v1.1.patch, 3121.v1.patch, 3121.v2.patch
> This is akin to MAPREDUCE-2413 but for YARN's NodeManager. We want to minimize the impact
of transient/permanent disk failures on containers. With larger number of disks per node,
the ability to continue to run containers on other disks is crucial.


