hadoop-common-dev mailing list archives

From "Amar Kamat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5643) Ability to blacklist tasktracker
Date Wed, 22 Apr 2009 08:42:47 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701438#action_12701438 ]

Amar Kamat commented on HADOOP-5643:

I think calling this blacklisting will lead to more confusion. As Owen suggested, we can
call it *decommissioning/recommissioning* of trackers, which would essentially mean that,
irrespective of what state the tracker is in, the jobtracker is asked to decommission (rerun + ignore)
or recommission (add back) it. So the commands would be

_bin/hadoop jobtracker -decommission tracker1,tracker2...._ and _bin/hadoop jobtracker -recommission

All the running tasks (and also the completed maps) that were launched on that machine will be killed
and rerun. We can reuse the lost-tracker code for doing this. Maybe a thread should be started
on demand (similar to the cleanup queue thread) for a decommissioning request. These trackers
will also be added to the ignore list (i.e. a 'shutdown' is issued upon contact). So a decommission
request is equivalent to lost-tracker + add-to-ignore-list.

Upon a recommission, the trackers will be removed from the ignore list. This can be done inline.
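
To make the "lost-tracker + add-to-ignore-list" equivalence concrete, here is a minimal sketch in Java. It is only an illustration of the semantics described above, not the actual jobtracker code: the class DecommissionSketch, the HeartbeatAction enum and all method names are hypothetical, and the lost-tracker handling is reduced to recording the tracker name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names come from the Hadoop code base. */
final class DecommissionSketch {
    /** What a tracker is told when it contacts the jobtracker (illustrative only). */
    enum HeartbeatAction { CONTINUE, SHUTDOWN }

    private final Set<String> ignoreList = ConcurrentHashMap.newKeySet();
    private final List<String> lostTrackerLog = new ArrayList<>();

    /** decommission = lost-tracker handling + add-to-ignore-list. */
    synchronized void decommission(Collection<String> trackers) {
        for (String tracker : trackers) {
            // Assumption: a tracker that is already ignored is not processed twice.
            if (ignoreList.add(tracker)) {
                handleLostTracker(tracker);
            }
        }
    }

    /** recommission = remove from the ignore list; cheap enough to run inline. */
    synchronized void recommission(Collection<String> trackers) {
        ignoreList.removeAll(trackers);
    }

    /** Ignored trackers are told to shut down upon contact. */
    HeartbeatAction onHeartbeat(String tracker) {
        return ignoreList.contains(tracker) ? HeartbeatAction.SHUTDOWN : HeartbeatAction.CONTINUE;
    }

    /** Stand-in for the lost-tracker code (kill and rerun tasks); here it only records the name. */
    private void handleLostTracker(String tracker) {
        lostTrackerLog.add(tracker);
    }

    /** Trackers for which lost-tracker handling was triggered, in order. */
    synchronized List<String> lostTrackerLog() {
        return new ArrayList<>(lostTrackerLog);
    }

    public static void main(String[] args) {
        DecommissionSketch jt = new DecommissionSketch();
        jt.decommission(Arrays.asList("tracker1", "tracker2"));
        System.out.println(jt.onHeartbeat("tracker1")); // prints SHUTDOWN
        jt.recommission(Arrays.asList("tracker1"));
        System.out.println(jt.onHeartbeat("tracker1")); // prints CONTINUE
    }
}
```

In this sketch the rerun work happens inline for brevity; the on-demand thread suggested above would take over the handleLostTracker step.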

From the web UI, a checkbox can be provided against each tracker, along with an action
named 'Decommission' (similar to the actions for jobs on jobtracker.jsp). On the
trackers page, we can provide another section for decommissioned trackers, with a
checkbox for recommissioning them.

Notes:
1) ACL checks should be done before decommissioning and recommissioning.
2) This info needs to be persisted. Upon every decommission/recommission, persist this info
to system.dir/jobtracker.info.
3) Upon restart, the ignore list will also be recovered and loaded (i.e. invoke jobtracker.decommission(recovered-list)
from the recovery-manager).
4) These new APIs can be added to the TaskTrackerManager interface, as these really are
tasktracker-level actions.
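
Notes 2 and 3 could look roughly like the Java sketch below. It is a sketch under stated assumptions, not the proposed implementation: the class IgnoreListStore and its methods are hypothetical, the file format (one tracker name per line) is made up for illustration, and a local file stands in for system.dir/jobtracker.info, which in a real cluster would be accessed through Hadoop's own filesystem layer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: persists the ignore list, one tracker name per line. */
final class IgnoreListStore {
    private final Path infoFile; // stands in for system.dir/jobtracker.info

    IgnoreListStore(Path infoFile) {
        this.infoFile = infoFile;
    }

    /** Called after every decommission/recommission with the current ignore list. */
    void persist(Set<String> ignored) {
        try {
            // Write to a temporary sibling first, then replace the old file,
            // so that a crash mid-write does not leave a truncated info file.
            Path tmp = infoFile.resolveSibling(infoFile.getFileName() + ".tmp");
            Files.write(tmp, new TreeSet<>(ignored), StandardCharsets.UTF_8);
            Files.move(tmp, infoFile, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called on restart; the result would be handed to a decommission call. */
    Set<String> recover() {
        Set<String> ignored = new TreeSet<>();
        if (!Files.exists(infoFile)) {
            return ignored; // nothing was ever decommissioned
        }
        try {
            for (String line : Files.readAllLines(infoFile, StandardCharsets.UTF_8)) {
                String tracker = line.trim();
                if (!tracker.isEmpty()) {
                    ignored.add(tracker);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return ignored;
    }
}
```

On restart, the recovery-manager would call recover() and pass the result to the decommission entry point, so that the recovered trackers are ignored again before they make contact.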

> Ability to blacklist tasktracker
> --------------------------------
>                 Key: HADOOP-5643
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5643
>             Project: Hadoop Core
>          Issue Type: New Feature
>    Affects Versions: 0.20.0
>            Reporter: Rajiv Chittajallu
>            Assignee: Amar Kamat
> It's not always possible to shut down the tasktracker to stop scheduling tasks on the node
> (e.g. you can't log in to the node but the TT is up).
> This can be done via
>   * mapred.exclude, which should be refreshed without restarting the tasktracker
>   * hadoop job -fail-tracker <tracker id>

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
