hadoop-common-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5262) Allow specifying min shares as percentage of cluster
Date Mon, 16 Feb 2009 11:14:59 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673849#action_12673849 ]

Hemanth Yamijala commented on HADOOP-5262:

Matei, the fixed number of slots use case makes sense. So, I guess we need both modes. 

I am considering how this would compare to having a separate config variable for expressing
the min share as a percentage, i.e., we could specify either minMaps and minReduces, or
minCapacityPercent, in the config file.

The advantages of a separate variable are:

- I think it is a little simpler to manage. If someone accidentally omits the percentage
symbol, the earlier suggestion could give weird results, because the bare number would
still be a valid value for the configuration. And like I mentioned, one can add up the
percentages and do a simple check that the total doesn't exceed 100, and so on.
- I don't see a need for two separate variables in the percentage mode. Like you've
mentioned in the description, groups pay for X% of the cluster capacity, and generally not
for X% of maps and Y% of reduces. So, setting up two numbers that are almost always going
to be the same doesn't seem to be required.
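
A minimal sketch of the "add up the percentages" check from the first point. Everything here is an assumption for illustration: the class PoolShareCheck, the method validateTotal, and the pool names are hypothetical and not part of the fair scheduler.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PoolShareCheck {
    /**
     * Hypothetical check: returns the sum of all pools' minCapacityPercent values,
     * throwing if any single value is outside 0..100 or the sum exceeds 100.
     */
    public static double validateTotal(Map<String, Double> minCapacityPercentByPool) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : minCapacityPercentByPool.entrySet()) {
            double pct = e.getValue();
            if (pct < 0.0 || pct > 100.0) {
                throw new IllegalArgumentException(
                    "Pool " + e.getKey() + ": minCapacityPercent " + pct + " is outside 0..100");
            }
            total += pct;
        }
        if (total > 100.0) {
            throw new IllegalArgumentException(
                "Guaranteed shares add up to " + total + "%, which exceeds 100%");
        }
        return total;
    }

    public static void main(String[] args) {
        // Pool names and values are made up for the example.
        Map<String, Double> pools = new LinkedHashMap<>();
        pools.put("research", 40.0);
        pools.put("production", 50.0);
        System.out.println("Total guaranteed: " + validateTotal(pools) + "%");
    }
}
```

With a dedicated percentage variable, a value such as 5 can only mean 5%, so a check like this does not have to guess whether a bare number was meant as a slot count.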

The disadvantages I see are:
- One more configuration variable.
- We need to define what happens if both are specified. We could either have one of the two
modes take precedence, or just error out. But either way, the semantics should be decided.
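
As one possible reading of the "error out" option, the sketch below rejects a pool that sets both kinds of values. The class MinShareMode, its Mode enum, and the resolve method are hypothetical names; a null argument stands for a setting that is absent from the config file. Choosing precedence instead would only change the branch that throws.

```java
public class MinShareMode {
    /** How a pool's minimum share was expressed in the config (hypothetical). */
    public enum Mode { NONE, SLOTS, PERCENT }

    /**
     * Decides which mode a pool uses. Null means "not specified".
     * Assumption: specifying both slot counts and a percentage is an error.
     */
    public static Mode resolve(Integer minMaps, Integer minReduces, Double minCapacityPercent) {
        boolean hasSlots = minMaps != null || minReduces != null;
        boolean hasPercent = minCapacityPercent != null;
        if (hasSlots && hasPercent) {
            throw new IllegalArgumentException(
                "Specify either minMaps/minReduces or minCapacityPercent, not both");
        }
        if (hasPercent) {
            return Mode.PERCENT;
        }
        return hasSlots ? Mode.SLOTS : Mode.NONE;
    }

    public static void main(String[] args) {
        System.out.println(resolve(42, 10, null));   // slot counts only
        System.out.println(resolve(null, null, 5.0)); // percentage only
    }
}
```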

In spite of the disadvantages, given how central this is, I would favor a separate configuration variable.

Thoughts?

> Allow specifying min shares as percentage of cluster
> ----------------------------------------------------
>                 Key: HADOOP-5262
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5262
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/fair-share
>            Reporter: Matei Zaharia
>            Priority: Minor
> Currently the guaranteed shares for pools in the fair scheduler are specified as a number
of slots. For organizations where a group pays X% of the cluster and the actual number of
nodes in the cluster varies due to failures, expansion, etc. over time, it would be useful
to support a guaranteed share given as a percentage too. This would just let you write in
the config file something like <minMaps>5%</minMaps> instead of <minMaps>42</minMaps>.
The scheduler would need to recompute what this means in terms of number of slots on every
update (probably through some kind of update(ClusterStatus) method in PoolManager).
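
The recomputation described above might look roughly like this. It is a sketch under stated assumptions: the class PercentMinShare and the method toSlots are hypothetical, the total slot count is passed in as a plain int (in the scheduler it would presumably be taken from the ClusterStatus on each update), and rounding down is an arbitrary choice that the issue does not specify.

```java
public class PercentMinShare {
    /**
     * Converts a percentage share into a slot count for the current cluster size.
     * Assumption: fractional slots are rounded down.
     */
    public static int toSlots(double percent, int totalSlots) {
        if (percent < 0.0 || percent > 100.0) {
            throw new IllegalArgumentException("percent must be within 0..100: " + percent);
        }
        if (totalSlots < 0) {
            throw new IllegalArgumentException("totalSlots must not be negative: " + totalSlots);
        }
        return (int) Math.floor(totalSlots * percent / 100.0);
    }

    public static void main(String[] args) {
        // Made-up cluster sizes: the same 5% share maps to a different
        // number of slots as nodes fail or are added.
        for (int totalSlots : new int[] {840, 800, 1000}) {
            System.out.println(totalSlots + " slots -> min share " + toSlots(5.0, totalSlots));
        }
    }
}
```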

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
