hadoop-mapreduce-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "rahul k singh (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-861) Modify queue configuration format and parsing to support a hierarchy of queues.
Date Sun, 30 Aug 2009 18:53:32 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12749304#action_12749304
] 

rahul k singh commented on MAPREDUCE-861:
-----------------------------------------

Attaching the new patch with above comments taken care of , except few of them:

bq.  createHierarchy should be refactored to parse leaf level queues and non leaf level queues
separately. I would suggest something along the lines of:
- The existing implementation is better , as it traverses much lesser time than the suggestion
above.

bq. In QueueManager.initialize, allQueues.putAll(leafQueues) seems unnecessary, since getInnerQueues
already does this.
- getInnerQueues returns the list of inner queues . i.e . queues which have children. Hence
the union of leafQueues and getInnerQueues would be the complete list. However in the deprecated
conf case , leafQueue is sufficient.

bq. I assume the order of child elements of an XML tag will be in the order in which they
are defined in the file. I think we must explicitly ensure that elements come in a fixed order.
For e.g. currently, we are not enforcing an order on how tags under the 'queue' node are defined.
This could lead to cases where if a 'queue' element comes before the 'name' element, we would
call parse hierarchy for the child queue without the parent's name.
- Done, but in different way. We still don't enforce the order ,but we make sure in code that
parent is fully created before there is an attempt to create
the child. Hence instead of depth first traversing we are now doing breadth first traversing.

bq. getInnerQueues can be renamed as getDescendentQueues
- getInnerQueus method intent is to return list of queues which have children.

Discussion points :
bq. Do we need leafQueues and allQueues stored separately ?
- There are more than one occasions where we need only leafQueues to be used and more than
one for only allQueues.
Initial thought was to maintain the list of inner queues(i.e , queues which always have children)
 and leaf queues and generate there union
whenever we need to access all the queues. Decided against it , as cost of putAll()(which
we will call while union) would be costly if we need to do that everytime and also The cost
of extra storage of leafQueues is not very high.

bq. Should we validate using an XSD ? I know you were planning on this. After looking at the
code, I am thinking this makes good sense.
- Yes i agree with this.

bq. I am unable to decide if refreshConfiguration should be in the base class for QueueHierarchyBuilder,
because it would need a Configuration in the deprecated case, and none in the new case.
- As of now we have single method refreshConfiguration(Configuration conf) and we pass null
in case of HierarchyQueueBuilder class.

bq. We need a test case to test default values in mapred-queues.xml, which will just have
the default queue, with no children and no properties, and acls disabled.
bq. dumpConfiguration is not yet done , will add 1 more patch to cover that.

- Wiil do above 2 and upload the new patch.




> Modify queue configuration format and parsing to support a hierarchy of queues.
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-861
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-861
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>            Reporter: Hemanth Yamijala
>            Assignee: rahul k singh
>         Attachments: MAPREDUCE-861-1.patch
>
>
> MAPREDUCE-853 proposes to introduce a hierarchy of queues into the Map/Reduce framework.
This JIRA is for defining changes to the configuration related to queues. 
> The current format for defining a queue and its properties is as follows: mapred.queue.<queue-name>.<property-name>.
For e.g. mapred.queue.<queue-name>.acl-submit-job. The reason for using this verbose
format was to be able to reuse the Configuration parser in Hadoop. However, administrators
currently using the queue configuration have already indicated a very strong desire for a
more manageable format. Since, this becomes more unwieldy with hierarchical queues, the time
may be good to introduce a new format for representing queue configuration.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message