hadoop-mapreduce-issues mailing list archives

From "Binglin Chang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5381) Support graceful decommission of tasktracker
Date Thu, 26 Sep 2013 03:44:06 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Binglin Chang updated MAPREDUCE-5381:
-------------------------------------

    Attachment: MAPREDUCE-5381-graceful-decomm.v1.patch

Attached a demo patch for graceful decommission of TaskTrackers. Changes:
1. Add an mradmin command, -decommission <host>, to gracefully decommission all TaskTrackers
running on the host. The effect of this command is that each TaskTracker's slot capacity is
changed to 0, so it accepts no new tasks; it then waits for all jobs running on it to finish,
and then stops.
2. This patch depends on MAPREDUCE-4900. It adds a new API (decommission) to
DynamicResourceProtocol and JobTrackerJMXBean.
3. Tests will be included after we agree on the interface and implementation.

Approach:
Add a new field, runningJobs, to TaskTrackerStatus, which is included in the TaskTracker
heartbeat. When a decommission command is invoked, the JobTracker first changes the related
TaskTrackers' slot capacity to 0 and then waits for their runningJobs counters to reach 0,
which means all jobs running on those TaskTrackers have finished and been cleaned up. The
JobTracker then shuts down each TaskTracker by rejecting its heartbeat.
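
The flow above can be illustrated with a small, self-contained model. This is only a sketch and not code from the attached patch: the class DecommissionSketch, its methods, and the Response values are hypothetical names, and it assumes one TaskTracker per host. It only shows the ordering described above: capacity goes to 0 first, heartbeats keep being accepted while runningJobs is above 0, and the heartbeat is rejected once the counter reaches 0.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified model of the JobTracker-side bookkeeping; not the patch's code. */
class DecommissionSketch {
    /** Possible replies to a heartbeat in this model (names are illustrative). */
    enum Response { OK, DRAINING, SHUTDOWN }

    /** Per-tracker state kept by the model JobTracker. */
    static final class TrackerState {
        int slotCapacity;
        boolean decommissioning;

        TrackerState(int slotCapacity) {
            this.slotCapacity = slotCapacity;
        }
    }

    private final Map<String, TrackerState> trackers = new HashMap<>();

    /** Registers a tracker with an initial slot capacity. */
    void register(String host, int slots) {
        trackers.put(host, new TrackerState(slots));
    }

    /** Step 1: set the slot capacity to 0 so no new tasks are assigned to the tracker. */
    void decommission(String host) {
        TrackerState state = trackers.get(host);
        if (state == null) {
            throw new IllegalArgumentException("unknown host: " + host);
        }
        state.slotCapacity = 0;
        state.decommissioning = true;
    }

    /** Returns the current slot capacity, or -1 if the tracker is no longer known. */
    int slotCapacity(String host) {
        TrackerState state = trackers.get(host);
        return state == null ? -1 : state.slotCapacity;
    }

    /**
     * Steps 2 and 3: each heartbeat carries the tracker's runningJobs counter.
     * A decommissioning tracker is kept alive until the counter reaches 0,
     * after which its heartbeat is rejected (modeled here as SHUTDOWN).
     */
    Response heartbeat(String host, int runningJobs) {
        TrackerState state = trackers.get(host);
        if (state == null) {
            return Response.SHUTDOWN; // unknown or already removed trackers are rejected
        }
        if (!state.decommissioning) {
            return Response.OK;
        }
        if (runningJobs > 0) {
            return Response.DRAINING; // jobs still running or being cleaned up
        }
        trackers.remove(host);
        return Response.SHUTDOWN;
    }

    public static void main(String[] args) {
        DecommissionSketch jobTracker = new DecommissionSketch();
        jobTracker.register("host1", 4);
        System.out.println(jobTracker.heartbeat("host1", 2)); // normal operation
        jobTracker.decommission("host1");
        System.out.println(jobTracker.heartbeat("host1", 2)); // still draining
        System.out.println(jobTracker.heartbeat("host1", 0)); // all jobs done
    }
}
```

In this model the tracker is dropped from the map on the first heartbeat that reports runningJobs == 0, so any later heartbeat from the same host is rejected as well.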
                
> Support graceful decommission of tasktracker
> --------------------------------------------
>
>                 Key: MAPREDUCE-5381
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5381
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mrv1
>    Affects Versions: 1.2.0
>            Reporter: Luke Lu
>            Assignee: Binglin Chang
>         Attachments: MAPREDUCE-5381-graceful-decomm.v1.patch
>
>
> When TTs are decommissioned for non-fault reasons (capacity change etc.), it's desirable
> to minimize the impact to running jobs.
> Currently if a TT is decommissioned, all running tasks on the TT need to be rescheduled
> on other TTs. Furthermore, for finished map tasks, if their map outputs have not been fetched
> by the reducers of the job, these map tasks will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a tasktracker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
