hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3930) Decide how to integrate scheduler info into CLI and job tracker web page
Date Wed, 17 Sep 2008 18:53:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631876#action_12631876 ]

Owen O'Malley commented on HADOOP-3930:

This is getting close, but I have a few suggestions.

When I asked you to split the queue queries out of JobClient, I didn't think about the API.
I think the API is better in JobClient, and JobQueueClient should only be the main class that
supports the CLI commands. JobQueueClient shouldn't be a public class, because otherwise it
ends up in the public API. So API access still goes through JobClient, and JobQueueClient
implements Tool, etc.

Let's rename JobSubmissionProtocol.getJobQueueInfos to getQueues(). getJobQueueInfo should be

The methods in JobQueueClient should be public, moved to JobClient, and renamed:
getAllQueueSchedulingInfo -> JobClient.getQueues()
getAllJobs -> JobClient.getJobsFromQueue(queueName)
getQueueSchedulingInfo -> JobClient.getQueueInfo(queueName)
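
A minimal sketch of the layout described above, assuming simplified stand-in types. JobClientSketch, JobQueueClientSketch, QueueInfo, addQueue, and the "-list" / "-info" flags are hypothetical names for illustration, and the List return types are a guess rather than the signatures in the patch. The real CLI class would implement Hadoop's Tool; here a plain run(String[]) method stands in so the example compiles without Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-ins only; none of these are the real Hadoop classes.
class QueueApiSketch {

  /** Simplified stand-in for JobQueueInfo: queue name plus scheduler info string. */
  public static final class QueueInfo {
    private final String queueName;
    private final String schedulingInfo;

    public QueueInfo(String queueName, String schedulingInfo) {
      this.queueName = queueName;
      this.schedulingInfo = schedulingInfo;
    }

    public String getQueueName() { return queueName; }
    public String getSchedulingInfo() { return schedulingInfo; }
  }

  /** Stand-in for JobClient: the public queue API lives here. */
  public static class JobClientSketch {
    private final Map<String, QueueInfo> queues = new LinkedHashMap<String, QueueInfo>();
    private final Map<String, List<String>> jobsByQueue =
        new LinkedHashMap<String, List<String>>();

    /** Hypothetical helper that seeds in-memory data; a real client would ask the JobTracker. */
    public void addQueue(String name, String schedulingInfo, List<String> jobIds) {
      queues.put(name, new QueueInfo(name, schedulingInfo));
      jobsByQueue.put(name, new ArrayList<String>(jobIds));
    }

    public List<QueueInfo> getQueues() {
      return new ArrayList<QueueInfo>(queues.values());
    }

    public List<String> getJobsFromQueue(String queueName) {
      List<String> jobs = jobsByQueue.get(queueName);
      return jobs == null
          ? Collections.<String>emptyList()
          : Collections.unmodifiableList(jobs);
    }

    public QueueInfo getQueueInfo(String queueName) {
      return queues.get(queueName); // null if the queue is unknown
    }
  }

  /** Package-private CLI driver: kept out of the public API, delegates to the client. */
  static class JobQueueClientSketch {
    private final JobClientSketch client;

    JobQueueClientSketch(JobClientSketch client) {
      this.client = client;
    }

    /** Same shape as Tool.run(String[]); the "-list" and "-info" flags are made up. */
    int run(String[] args) {
      if (args.length == 1 && "-list".equals(args[0])) {
        for (QueueInfo q : client.getQueues()) {
          System.out.println(q.getQueueName() + "\t" + q.getSchedulingInfo());
        }
        return 0;
      }
      if (args.length == 2 && "-info".equals(args[0])) {
        QueueInfo q = client.getQueueInfo(args[1]);
        if (q == null) {
          System.err.println("Unknown queue: " + args[1]);
          return 1;
        }
        System.out.println(q.getQueueName() + "\t" + q.getSchedulingInfo());
        for (String jobId : client.getJobsFromQueue(args[1])) {
          System.out.println("  " + jobId);
        }
        return 0;
      }
      System.err.println("Usage: -list | -info <queue>");
      return -1;
    }
  }
}
```

With this split, callers program against the three public methods, and the command-line wrapper can change without touching the public API.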

mapred.JSPUtil should *not* be public.

Several of the new public API classes and methods are missing javadoc.

JobQueueInfo.schedulerInfo should be a string, rather than an object. Since the serialization
forces it to be a string, it should just be typed/stored that way. The QueueManager should
probably have a map like:
  Map<String, Object> schedulerInfo; // map from queue name to scheduler specific object
and just create the JobQueueInfo when the JobSubmissionProtocol methods are called. The constructor
should take the two strings; don't bother with setSchedulerInfo.
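
One way this could look, as a rough sketch rather than the actual patch: the manager keeps the scheduler-specific objects, and an immutable info object holding two strings is built on demand. The class names with a Sketch suffix are hypothetical, and rendering the scheduler object with toString() (with an empty string for unknown queues) is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-ins only; none of these are the real Hadoop classes.
class SchedulerInfoSketch {

  /** Immutable info object: two strings in the constructor, no setSchedulerInfo. */
  static final class JobQueueInfoSketch {
    private final String queueName;
    private final String schedulingInfo;

    JobQueueInfoSketch(String queueName, String schedulingInfo) {
      this.queueName = queueName;
      this.schedulingInfo = schedulingInfo;
    }

    String getQueueName() { return queueName; }
    String getSchedulingInfo() { return schedulingInfo; }
  }

  /** Stand-in for QueueManager: owns the scheduler-specific objects. */
  static class QueueManagerSketch {
    // map from queue name to scheduler specific object
    private final Map<String, Object> schedulerInfo = new HashMap<String, Object>();

    synchronized void setSchedulerInfo(String queueName, Object info) {
      schedulerInfo.put(queueName, info);
    }

    /** Builds a fresh info object at call time, so it reflects the scheduler's current state. */
    synchronized JobQueueInfoSketch getJobQueueInfo(String queueName) {
      Object info = schedulerInfo.get(queueName);
      // Assumption: the scheduler object renders itself through toString().
      String text = (info == null) ? "" : info.toString();
      return new JobQueueInfoSketch(queueName, text);
    }
  }
}
```

Because the string is produced only when a protocol method is called, the serialized form never goes stale and the info object needs no setter.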

I'm not very happy with ClientUtil. It seems like a weak abstraction. Is it really necessary,
especially if you fold the API back into JobClient?

> Decide how to integrate scheduler info into CLI and job tracker web page
> ------------------------------------------------------------------------
>                 Key: HADOOP-3930
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3930
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Matei Zaharia
>            Assignee: Sreekanth Ramakrishnan
>             Fix For: 0.19.0
>         Attachments: 3930-1.patch, HADOOP-3930-2.patch, HADOOP-3930-3.patch, HADOOP-3930-4.patch, HADOOP-3930-5.patch, HADOOP-3930-6.patch, HADOOP-3930-7.patch, HADOOP-3930-8.patch, HADOOP-3930-9.patch,
> We need a way for job schedulers such as HADOOP-3445 and HADOOP-3476 to provide info to display on the JobTracker web interface and in the CLI. The main things needed seem to be:
> * A way for schedulers to provide info to show in a column on the web UI and in the CLI - something as simple as a single string, or a map<string, int> for multiple parameters.
> * Some sorting order for jobs - maybe a method to sort a list of jobs.
> Let's figure out what the best way to do this is and implement it in the existing schedulers.
> My first-order proposal at an API: Augment the TaskScheduler with
> * public Map<String, String> getSchedulingInfo(JobInProgress job) -- returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of jobs.
> * public Map<String, String> getSchedulingInfo(String queue) -- returns key-value pairs which are displayed in columns on the web UI or the CLI for the list of queues.
> * public Collection<JobInProgress> getJobs(String queueName) -- returns the list of jobs in a given queue, sorted by a scheduler-specific order (the order it wants to run them in / schedule the next task in / etc).
> * public List<String> getQueues();

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
