hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3740) Make JobInProgress pluggable
Date Mon, 14 Jul 2008 10:49:33 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613289#action_12613289 ]

Vivek Ratan commented on HADOOP-3740:

bq. So the problem seems to be of resource constraints. [...]

I think it's trickier than that. The Hadoop scheduler is resource-aware, though in a very
simple way. A Map or Reduce slot is a resource, as is temp disk space (see HADOOP-657). The
tricky part is that the license server is a shared resource, as you point out. It's not something
that's available within a TT, i.e., it's not a resource that the TT offers. We probably need
a different model to handle shared resources in the cluster. 

Alejandro, your best option for now is to do what you suggest - override _obtainNew*Task_.
Based on HADOOP-3445, we're adding hooks to override various steps in the scheduling process
(picking the right queue, the right job, then the right task), but you need hooks within the
code that picks the right task. I'm not sure that can be done easily, since you'd need to break
down the logic in _obtainNew*Task_ in a generic fashion and allow users to override part of
it dynamically. What you're suggesting looks to be the quickest way to get what you want,
though maybe not the most generic way (we'll need a few more use cases to figure that out).
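
To make the "override part of it" idea concrete, here is a minimal sketch of a task picker whose scan loop is fixed while one step is left open for subclasses. Everything in it is hypothetical: TaskPicker, FilteringTaskPicker, pickTask and accept are stand-ins written for illustration, tasks are plain strings, and none of this is the JobInProgress or scheduler API.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for the task-picking logic; not a Hadoop class. */
class TaskPicker {
    /** Fixed skeleton: scan candidates in order, return the first accepted one, or null. */
    final String pickTask(List<String> candidates) {
        for (String task : candidates) {
            if (accept(task)) {
                return task;
            }
        }
        return null; // nothing eligible right now
    }

    /** The overridable step. The default accepts every candidate. */
    protected boolean accept(String task) {
        return true;
    }
}

/** Hypothetical subclass that vetoes tasks by name, e.g. ones waiting on an external resource. */
class FilteringTaskPicker extends TaskPicker {
    private final Set<String> blocked;

    FilteringTaskPicker(Set<String> blocked) {
        this.blocked = blocked;
    }

    @Override
    protected boolean accept(String task) {
        return !blocked.contains(task);
    }
}
```

The hard part described above is not shown here: deciding which steps of the real _obtainNew*Task_ logic are safe to expose this way.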

> Make JobInProgress pluggable
> ----------------------------
>                 Key: HADOOP-3740
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3740
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>         Environment: all
>            Reporter: Alejandro Abdelnur
> By allowing a pluggable JobInProgress, it will be possible to provide implementations
that can do sophisticated task provisioning for the JobTracker.
> For example, by providing alternate implementations of {{obtainNewMapTask}}, {{obtainNewReduceTask}}
and {{updateTaskInProgress}}, it would be possible to implement a license server that
throttles use of external resources (e.g., web services, databases) so that at any given time there
are no more than N tasks using a given resource. For this, a task could be tagged with the
names of external resources, and the license server would keep track of the tasks running per
tag. If the counter for a tag reaches zero, the {{obtainNew*Task}} method could return null instead
of a task.
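
The counting scheme described above can be sketched as follows. This is a standalone illustration under stated assumptions, not the proposed patch: LicenseServer, TaggedTask, ThrottledJob, addTask, obtainNewTask and taskFinished are hypothetical names, the counter is assumed to count remaining permits down from N, and unregistered tags are assumed to be refused.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical license server: remaining permits per external-resource tag. */
class LicenseServer {
    private final Map<String, Integer> remaining = new HashMap<>();

    /** Allows at most 'limit' concurrently running tasks for this tag. */
    synchronized void register(String tag, int limit) {
        remaining.put(tag, limit);
    }

    /** Takes one permit per tag, or none at all if any tag is exhausted or unregistered. */
    synchronized boolean tryAcquire(Set<String> tags) {
        for (String tag : tags) {
            if (remaining.getOrDefault(tag, 0) <= 0) {
                return false;
            }
        }
        for (String tag : tags) {
            remaining.merge(tag, -1, Integer::sum);
        }
        return true;
    }

    /** Returns one permit per tag; call only after a successful tryAcquire. */
    synchronized void release(Set<String> tags) {
        for (String tag : tags) {
            remaining.merge(tag, 1, Integer::sum);
        }
    }
}

/** Stand-in for a task tagged with external resource names; not a Hadoop class. */
class TaggedTask {
    final String name;
    final Set<String> tags;

    TaggedTask(String name, Set<String> tags) {
        this.name = name;
        this.tags = tags;
    }
}

/** Stand-in for the obtainNew*Task idea; not the real JobInProgress API. */
class ThrottledJob {
    private final Deque<TaggedTask> pending = new ArrayDeque<>();
    private final LicenseServer licenses;

    ThrottledJob(LicenseServer licenses) {
        this.licenses = licenses;
    }

    void addTask(TaggedTask task) {
        pending.add(task);
    }

    /** Returns the next pending task, or null if none is pending or its permits are exhausted. */
    TaggedTask obtainNewTask() {
        TaggedTask next = pending.peek();
        if (next == null || !licenses.tryAcquire(next.tags)) {
            return null;
        }
        return pending.poll();
    }

    /** Gives the task's permits back once it has finished. */
    void taskFinished(TaggedTask task) {
        licenses.release(task.tags);
    }
}
```

Note that this sketch only looks at the head of the queue, so one throttled task blocks the tasks behind it; a real implementation would likely scan for another eligible task instead.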

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
