hadoop-mapreduce-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-860) Modify Queue APIs to support a hierarchy of queues
Date Fri, 14 Aug 2009 07:45:15 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743121#action_12743121 ]

Hemanth Yamijala commented on MAPREDUCE-860:

One proposal is to use a 'fully qualified queue name': a separator-delimited list of queue
names that uniquely identifies a queue in the hierarchy. Let's say the separator character is
'-' (subject to discussion). An example of a fully qualified queue name would then be "org1-priority-production".
A name that is not fully qualified (that is, one with no separator) maps to a top level queue. This definition
is backwards compatible for existing sites, because top level queues and leaf level queues
are the same in this case.
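
To make the naming scheme concrete, here is a minimal sketch of how such names could be split and joined. QueueNames and its methods are hypothetical helpers and not part of Hadoop, and the '-' separator is only the placeholder suggested above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for fully qualified queue names; not an existing Hadoop class. */
final class QueueNames {
    // Assumption: '-' is the separator; the proposal leaves this open to discussion.
    static final String SEPARATOR = "-";

    private QueueNames() {}

    /** Splits a fully qualified name into its path, top level queue first. */
    static List<String> split(String fullyQualifiedName) {
        return Arrays.asList(fullyQualifiedName.split(Pattern.quote(SEPARATOR)));
    }

    /** Joins a path of queue names back into a fully qualified name. */
    static String join(List<String> path) {
        return String.join(SEPARATOR, path);
    }

    /** A name with no separator refers to a top level queue. */
    static boolean isTopLevel(String name) {
        return !name.contains(SEPARATOR);
    }
}
```

With this sketch, QueueNames.split("org1-priority-production") yields the path org1, priority, production, and a plain name such as "default" is treated as a top level queue. One consequence is that queue names themselves could not contain the separator character.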

With this in mind, the APIs can be redefined as follows:

- getJobsFromQueue returns jobs from the named queue. New sites that use hierarchical queues
must pass the fully qualified queue name of the job queue. In case a container queue name
is passed to the API, we could throw an IOException with an explanatory message. The other
option is to return null, but then it would be difficult to know whether it is because there
are no jobs in the queue, or due to an error condition.
- getQueueInfo works as before.
- For getQueues we have two choices. We can:
-- return only information about leaf level queues, or job queues.
-- return information about all queues.
Either approach could be considered backwards compatible, because for existing sites, where
every queue is a leaf level queue, both approaches return the same list of queues.
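
As a rough illustration of the getJobsFromQueue behavior described above, the sketch below throws an IOException when a container queue name is passed. QueueRegistrySketch is a hypothetical in-memory stand-in, job IDs are plain strings rather than the real job status type, and throwing for an unknown queue name is an extra assumption not stated in the proposal.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of queues; not the JobTracker implementation. */
final class QueueRegistrySketch {
    // Fully qualified job (leaf) queue name -> job IDs in that queue.
    private final Map<String, List<String>> jobsByJobQueue = new HashMap<>();
    // Fully qualified names of container (non-leaf) queues.
    private final Set<String> containerQueues = new HashSet<>();

    void addContainerQueue(String name) {
        containerQueues.add(name);
    }

    void addJobQueue(String name, String... jobIds) {
        jobsByJobQueue.put(name, Arrays.asList(jobIds));
    }

    /** Returns the jobs in a job queue; an empty list means the queue has no jobs. */
    List<String> getJobsFromQueue(String queueName) throws IOException {
        if (containerQueues.contains(queueName)) {
            // An exception keeps "no jobs" distinguishable from "wrong kind of queue".
            throw new IOException("Queue '" + queueName
                + "' is a container queue; pass the fully qualified name of a job queue");
        }
        List<String> jobs = jobsByJobQueue.get(queueName);
        if (jobs == null) {
            // Assumption: unknown names are also reported as errors.
            throw new IOException("Unknown queue: " + queueName);
        }
        return Collections.unmodifiableList(jobs);
    }
}
```

Here an empty list always means "the queue exists and has no jobs", which is the ambiguity that returning null would not resolve.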

JobQueueInfo should carry the fully qualified name of the queue as its queueName.

We would also need new APIs to navigate the queue hierarchy. There are a couple of options
to do this. 
- One is to maintain the hierarchy using JobQueueInfo objects, i.e. a JobQueueInfo object
will contain the JobQueueInfo objects for its children. The advantage is that a single
call returns the entire hierarchy, which allows efficient navigation and fewer lookups
when we want the entire tree. The disadvantage is the amount of data transferred if
we are interested only in part of the hierarchy. This change would also mean that the JobQueueInfo
class itself is not backwards compatible; although behavior can be maintained, client code
will need to use the new JobClient code.

- The other would be to provide APIs such as the following:
// return information about queues that form the roots of the hierarchy
JobQueueInfo[] getRootQueues();
// return information about queues that are under a given queue
JobQueueInfo[] getChildQueues(String queueName);
This would allow traversal of the entire tree and also of portions of the hierarchy. However,
if the interest is in all the queues, this means a lot of RPC calls and lookups on the server.
- We could do both - i.e. have an API such as JobQueueInfo[] getAllQueues() that returns the
entire tree, and the more specific calls to navigate just parts of the hierarchy. 
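
The trade-off between these options can be sketched as follows. QueueInfoSketch stands in for JobQueueInfo and QueueNavigatorSketch for the server side; both are hypothetical, the calls counter merely simulates RPC round trips, and fetchWholeTree is an illustrative client-side helper rather than a proposed API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for JobQueueInfo: a fully qualified name plus nested children. */
final class QueueInfoSketch {
    final String queueName;
    final List<QueueInfoSketch> children = new ArrayList<>();

    QueueInfoSketch(String queueName) {
        this.queueName = queueName;
    }
}

/** Hypothetical in-memory model of the navigation APIs. */
final class QueueNavigatorSketch {
    // Parent fully qualified name -> child fully qualified names; "" keys the roots.
    private final Map<String, List<String>> childrenOf = new HashMap<>();
    int calls = 0; // number of simulated remote calls made so far

    void addQueue(String parent, String name) {
        childrenOf.computeIfAbsent(parent, k -> new ArrayList<>()).add(name);
    }

    /** Returns the queues that form the roots of the hierarchy. */
    QueueInfoSketch[] getRootQueues() {
        return getChildQueues("");
    }

    /** Returns the queues directly under the given queue. */
    QueueInfoSketch[] getChildQueues(String queueName) {
        calls++;
        List<String> names = childrenOf.getOrDefault(queueName, Collections.emptyList());
        QueueInfoSketch[] result = new QueueInfoSketch[names.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = new QueueInfoSketch(names.get(i));
        }
        return result;
    }

    /** Client-side walk of the whole tree: one call for the roots plus one per queue. */
    List<QueueInfoSketch> fetchWholeTree() {
        List<QueueInfoSketch> roots = new ArrayList<>(Arrays.asList(getRootQueues()));
        Deque<QueueInfoSketch> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            QueueInfoSketch queue = pending.pop();
            queue.children.addAll(Arrays.asList(getChildQueues(queue.queueName)));
            pending.addAll(queue.children);
        }
        return roots;
    }
}
```

Walking a hierarchy of N queues this way costs N + 1 calls, which is the overhead a single getAllQueues() call returning the nested structure would avoid, at the price of always transferring the entire tree.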

Thoughts?

> Modify Queue APIs to support a hierarchy of queues
> --------------------------------------------------
>                 Key: MAPREDUCE-860
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-860
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: jobtracker
>            Reporter: Hemanth Yamijala
>            Assignee: rahul k singh
> MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework.
> This JIRA is for defining changes to the APIs related to queues.

