hadoop-common-dev mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3740) Make JobInProgress pluggable
Date Mon, 08 Sep 2008 06:44:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12629085#action_12629085 ]

Alejandro Abdelnur commented on HADOOP-3740:

Given the current {{TaskScheduler}} API:

* It requires duplicating all the scheduler code, specifically the {{assignTask}} method,
just to add a check before calling {{JobInProgress.obtainNew*Task}}.

With the patch approach, throttling of tasks based on 'license availability' can be reused
across different scheduler implementations.

* The {{TaskScheduler}} does not get notified when the task ends (due to failure or success).

The scheduler cannot track this itself, because task completion is handled in {{JobInProgress.complete.updateTaskStatus()}}.

Maybe there should be something like a {{TaskListener}} with {{start}}, {{update}} and {{finish}}
methods, where the {{start}} call is able to veto the task.
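
A minimal sketch of what such a listener could look like. Everything here is an assumption for illustration: no {{TaskListener}} exists in Hadoop, the method signatures are guesses, and {{MaxRunningTasksListener}} is a made-up example implementation that vetoes tasks once a fixed number are running.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical interface: nothing with this name exists in Hadoop today.
interface TaskListener {
    // Called before a task is handed out; returning false vetoes the task.
    boolean start(String taskId);

    // Called when a running task reports progress (0.0 to 1.0).
    void update(String taskId, float progress);

    // Called once when the task ends, whether it succeeded or failed.
    void finish(String taskId, boolean succeeded);
}

// Made-up example listener: vetoes new tasks while maxRunning tasks are in flight.
class MaxRunningTasksListener implements TaskListener {
    private final int maxRunning;
    private final Set<String> running = new HashSet<>();

    MaxRunningTasksListener(int maxRunning) {
        this.maxRunning = maxRunning;
    }

    @Override
    public synchronized boolean start(String taskId) {
        if (running.size() >= maxRunning) {
            return false; // veto: no free slot
        }
        running.add(taskId);
        return true;
    }

    @Override
    public synchronized void update(String taskId, float progress) {
        // Progress is not needed for throttling; a real listener might log it.
    }

    @Override
    public synchronized void finish(String taskId, boolean succeeded) {
        running.remove(taskId); // frees the slot on success and on failure alike
    }
}
```

With a limit of 1, a second {{start}} call is vetoed until {{finish}} has been called for the first task. Because {{finish}} fires on failure as well as success, the listener can always release what it reserved in {{start}}.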

Making {{JobInProgress}} pluggable is a temporary solution, aimed specifically at
solving the second bullet.

> Make JobInProgress pluggable
> ----------------------------
>                 Key: HADOOP-3740
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3740
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: patch3740.txt
> By allowing a pluggable JobInProgress, it will be possible to provide implementations
that can do sophisticated task provisioning for the JobTracker.
> For example, by providing alternate implementations of {{obtainNewMapTask}}, {{obtainNewReduceTask}}
and {{updateTaskInProgress}}, it would be possible to implement a license server that
throttles the use of external resources (e.g. web services, databases) so that at any given time
no more than N tasks are using a given resource. For this, a task could be tagged with the
names of the external resources it uses, and the license server would keep track of the tasks running per
tag. When the counter of free licenses for a tag reaches zero, the {{obtainNew*Task}} method could return null instead
of a task.
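
The license-server idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: {{LicenseServer}} and {{ThrottledTaskSource}} are hypothetical names, tasks are reduced to plain string ids, and {{obtainNewTask}} and {{taskFinished}} merely stand in for the real {{obtainNew*Task}} and task-update methods, whose actual signatures are not reproduced here. Tags that were never registered are treated as unrestricted, which is a design choice of this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical class: keeps a counter of free licenses per resource tag.
class LicenseServer {
    private final Map<String, Integer> free = new HashMap<>();

    // Allows at most maxConcurrent tasks to use the resource named by tag.
    synchronized void register(String tag, int maxConcurrent) {
        free.put(tag, maxConcurrent);
    }

    // Takes one license for every tag, or none at all if any tag is exhausted.
    synchronized boolean tryAcquire(Set<String> tags) {
        for (String tag : tags) {
            Integer left = free.get(tag);
            if (left != null && left == 0) {
                return false; // counter reached zero for this tag
            }
        }
        for (String tag : tags) {
            free.computeIfPresent(tag, (t, left) -> left - 1);
        }
        return true;
    }

    // Returns one license for every registered tag in the set.
    synchronized void release(Set<String> tags) {
        for (String tag : tags) {
            free.computeIfPresent(tag, (t, left) -> left + 1);
        }
    }
}

// Hypothetical stand-in for a pluggable JobInProgress.
class ThrottledTaskSource {
    private final LicenseServer licenses;

    ThrottledTaskSource(LicenseServer licenses) {
        this.licenses = licenses;
    }

    // Stand-in for obtainNew*Task: returns null when a license is missing.
    String obtainNewTask(String taskId, Set<String> resourceTags) {
        return licenses.tryAcquire(resourceTags) ? taskId : null;
    }

    // Stand-in for the task-completion hook: gives the licenses back.
    void taskFinished(Set<String> resourceTags) {
        licenses.release(resourceTags);
    }
}
```

With a tag registered for N = 1, the second request for a task carrying that tag returns null until the first task has finished and released its license.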

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
