hadoop-mapreduce-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-1118) Capacity Scheduler scheduling information is hard to read / should be tabular format
Date Fri, 13 Aug 2010 10:21:19 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898174#action_12898174 ]

Hemanth Yamijala commented on MAPREDUCE-1118:
---------------------------------------------

I took this patch for a spin. Without going into the details of the code, I see a few
high-level points to discuss:

- The /scheduler page that is being added does not have all the fields for the queues; it
only has fields related to capacity parameters. The queueinfo page, on the other hand, has more
fields, such as tasks, limits and job counts. I would imagine we'll need more information
in the servlet, no? Allen?

- The patch doesn't play well with hierarchical queues introduced in Hadoop 0.21 (MAPREDUCE-853).
When I configured a simple hierarchy with two parent-level queues, p1 and p2, each having one
child queue (q1 and q2 respectively), I got the following exception when I accessed /scheduler:

{code}
java.lang.NullPointerException
	at org.apache.hadoop.mapred.CapacitySchedulerServlet.showQueues(CapacitySchedulerServlet.java:127)
	at org.apache.hadoop.mapred.CapacitySchedulerServlet.doGet(CapacitySchedulerServlet.java:90)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
{code}

More importantly, I am not sure *what* should be displayed for hierarchical queues in order
to meet Allen's original requirement, i.e. an easy-to-read interface for comparing queue
information. The difficulty is that it does not make sense to compare queues across hierarchies
(unless the information is normalized at some global level). So it probably makes sense to have
this kind of tabular structure for the queues at every level. This issue does not arise for
Hadoop 0.20, which has no hierarchical queues.

- From an interface point of view, I think we can do better in how scheduling information
is accessed from the main page. This information has been available via the 'queues' link,
and this patch adds another entry point - the /scheduler page. Perhaps discussion around the
hierarchical queue interface will give us ideas around this as well.
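
To make the "one table per level" idea above concrete, here is a minimal sketch in Java. It is
only an illustration under stated assumptions: QueueNode and QueueTables are hypothetical names,
not classes from the patch or from the capacity scheduler, the single capacity column stands in
for whatever fields the real page would show, and the "." used to join queue names in the table
titles is an arbitrary choice. Each table lists only sibling queues, so queues from different
hierarchies never end up in the same table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical queue model for illustration; not the scheduler's real classes.
final class QueueNode {
    final String name;
    final int capacityPercent;   // assumed to be relative to the parent queue
    final List<QueueNode> children = new ArrayList<>();

    QueueNode(String name, int capacityPercent) {
        this.name = name;
        this.capacityPercent = capacityPercent;
    }

    // Adds a child queue and returns this node so calls can be chained.
    QueueNode add(QueueNode child) {
        children.add(child);
        return this;
    }
}

final class QueueTables {
    private QueueTables() {}

    // Renders one plain-text table per parent queue, starting at the root.
    static String render(QueueNode root) {
        StringBuilder out = new StringBuilder();
        renderLevel(root, root.name, out);
        return out.toString();
    }

    private static void renderLevel(QueueNode parent, String path, StringBuilder out) {
        if (parent.children.isEmpty()) {
            return; // leaf queue: it has no child table
        }
        out.append("Queues under ").append(path).append('\n');
        out.append(String.format("%-10s %10s\n", "Queue", "Capacity%"));
        for (QueueNode child : parent.children) {
            out.append(String.format("%-10s %10d\n", child.name, child.capacityPercent));
        }
        out.append('\n');
        // Recurse only after the sibling table is complete, so tables do not interleave.
        for (QueueNode child : parent.children) {
            renderLevel(child, path + "." + child.name, out);
        }
    }
}
```

For the p1/q1, p2/q2 hierarchy described above (with an assumed root node and made-up capacities),
QueueTables.render would produce three tables: one comparing p1 and p2, one for the children of
p1, and one for the children of p2.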

> Capacity Scheduler scheduling information is hard to read / should be tabular format
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1118
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1118
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: Allen Wittenauer
>            Assignee: Krishna Ramachandran
>         Attachments: mapred-1118-1.patch, mapred-1118-2.patch, mapred-1118-3.patch, mapred-1118.20S.patch, mapred-1118.patch
>
>
> The scheduling information provided by the capacity scheduler is extremely hard to read
> on the job tracker web page.  Instead of just flat text, it should be presenting the information
> in a tabular format, similar to what the fair share scheduler provides.  This makes it much
> easier to compare what different queues are doing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

