hadoop-common-dev mailing list archives

From "Vivek Ratan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3479) Implement configuration items useful for Hadoop resource manager (v1)
Date Fri, 20 Jun 2008 04:03:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606628#action_12606628 ]

Vivek Ratan commented on HADOOP-3479:

Doug, when we (mostly Hemanth and I) thought about configuration for queues, we felt we had
a somewhat unusual situation: the number of queues is dynamic, and each queue has different
values for a common set of attributes. We felt we had two fundamental implementation choices:

* work within the 'flatness' of the existing Configuration class and config format to express
the more hierarchical config structure that queues inherently need (by 'flatness', I mean that
we don't have an obvious way today to express hierarchies or a dynamic number of entries).
* extend the Configuration class and its supporting classes to handle hierarchies and a dynamic
number of properties.

We looked hard at the first option and considered some strategies, including some of the ones
you've mentioned. In fact, I sent out a mail to code-dev on Tue, 6/3, suggesting an option
of having a property that lists comma-separated queue names or number of queues, and then
building the property name string for each queue attribute and getting its value from the
config file. The pros are obvious: we use the existing framework and do not duplicate code.
But we felt that the configuration framework needed to fundamentally handle hierarchies and
a dynamic number of properties. Arguably, it's only for queues today, but later, if/when we
support hierarchies of Orgs/queues/users and perhaps overriding of default values (lower-level
entities in the hierarchies can override higher-level defaults; for example, queues can specify
some defaults for users which some users can override), we will need such functionality. 
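
For concreteness, here is a rough sketch of the flat option described above: one property
lists the queue names, and the property name for each queue attribute is built as a string
and looked up. This is only an illustration, not code from the patch. java.util.Properties
stands in for the Configuration class so the sketch is self-contained, and the property names
("queue.names", "queue.<name>.capacity", "queue.<name>.max-jobs") are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Sketch only: java.util.Properties stands in for Hadoop's Configuration,
// and every property name used here is hypothetical.
class FlatQueueConfigSketch {

    // Attributes that every queue is assumed to be able to define.
    static final String[] ATTRIBUTES = {"capacity", "max-jobs"};

    // Reads the comma-separated queue list, then builds the key
    // "queue.<name>.<attribute>" for each attribute of each queue.
    static Map<String, Map<String, String>> loadQueues(Properties conf) {
        Map<String, Map<String, String>> queues = new LinkedHashMap<>();
        String names = conf.getProperty("queue.names", "");
        for (String raw : names.split(",")) {
            String name = raw.trim();
            if (name.isEmpty()) {
                continue; // tolerate an empty list or stray commas
            }
            Map<String, String> attrs = new LinkedHashMap<>();
            for (String attr : ATTRIBUTES) {
                String value = conf.getProperty("queue." + name + "." + attr);
                if (value != null) {
                    attrs.put(attr, value); // unset attributes are simply skipped
                }
            }
            queues.put(name, attrs);
        }
        return queues;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("queue.names", "research, production");
        conf.setProperty("queue.research.capacity", "30");
        conf.setProperty("queue.production.capacity", "70");
        conf.setProperty("queue.production.max-jobs", "100");
        System.out.println(loadQueues(conf));
    }
}
```

Note that every attribute of every queue needs its own property in this scheme, which is
where the verbosity comes from.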

Now, if we need this functionality, we felt we had two options: 
* we could alter the Configuration class to support the new features, then perhaps have a
QueueConf subclass, just like we have one for JobConf. 
* we could build this functionality in the QueueConf class separately to see how well it works

There is no doubt that the first approach is a better longer-term solution. However, we didn't
really want to change the Configuration class too much at this stage. It's a core class, and
given that this whole stuff about queues and orgs is fairly new and we don't know how much
it will change over time, we felt that we should restrict our modifications to the QueueConf
class for now. This isolates the more stable Configuration class from too many changes. It's
also something we can do faster. Once we feel we have it right, we do want to do the right
thing long-term, which is to build support in the Configuration class. Furthermore, people
haven't responded very much to our proposal, and given that configuration is usually one of
those areas where there are lots of strong views, we wanted to keep the code impact minimal,
pending further discussions. The flip side is duplication of code, which you have pointed
out. And there is always the danger that once things get into the code base, they're often
not modified according to their original intent, but that's something we need to be disciplined
about. But eventually, we do want to go with the first option.

I'm personally not against using the current flat config structure to handle queues, at least
until we have more use cases for hierarchical configuration. But I think that by limiting
the new code to a separate class, and the new configuration to a separate file, we isolate
ourselves against too many changes to code in the future, till we get our use cases right.
And the configuration format that Hemanth has proposed is more compact and easier to understand
than having a separate property for every attribute of every queue. 

These are the reasons for our proposal. We'd love to hear more. Hemanth's been soliciting
comments for a while :) 

> Implement configuration items useful for Hadoop resource manager (v1)
> ---------------------------------------------------------------------
>                 Key: HADOOP-3479
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3479
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>         Attachments: 3479.1.patch, 3479.patch
> HADOOP-3421 lists requirements for a new resource manager for Hadoop. Implementation
> for these will require support for new configuration items in Hadoop. This JIRA is to define
> such configuration, and track its implementation.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
