hadoop-mapreduce-issues mailing list archives

From "Matei Zaharia (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-707) Provide a jobconf property for explicitly assigning a job to a pool
Date Tue, 03 Nov 2009 08:09:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772933#action_12772933 ]

Matei Zaharia commented on MAPREDUCE-707:
-----------------------------------------

I don't think that does what you want it to do, because it just sets poolNameProperty to a
different value. What you want is for each *individual* job's pool to be determined based
on which properties are in its JobConf (if it has mapred.fairscheduler.pool, use that; otherwise
use whatever property poolNameProperty is set to; and if that doesn't exist in the JobConf
either, use DEFAULT_POOL_NAME). The XML code I posted makes the job's JobConf implement this
logic, by having the "pool" property be whatever the user sets it to if the user provides
a setting, and making it default to the value of user.name otherwise. But there's no way to
achieve this same behavior by just changing PoolManager.initialize to set poolNameProperty
another way.
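
To illustrate why the XML trick works per job, here is a minimal sketch. It is not Hadoop code: a plain Map stands in for the JobConf, and the class PoolNameTrickSketch and its resolve and poolFor helpers are hypothetical names that only imitate the way ${...} references in a property value are expanded against other properties.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the JobConf behaviour the XML trick relies on.
final class PoolNameTrickSketch {
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    // Looks up a key and expands ${name} references once against the same map.
    // References to unknown names are left untouched.
    static String resolve(Map<String, String> conf, String key) {
        String raw = conf.get(key);
        if (raw == null) {
            return null;
        }
        Matcher m = VAR.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = conf.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // The scheduler reads whichever property poolNameProperty names.
    static String poolFor(Map<String, String> jobConf, String poolNameProperty) {
        return resolve(jobConf, poolNameProperty);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("user.name", "alice");          // example user, an assumption
        conf.put("pool.name", "${user.name}");   // default from the XML above
        System.out.println(poolFor(conf, "pool.name"));

        conf.put("pool.name", "production");     // the user overrides the default
        System.out.println(poolFor(conf, "pool.name"));
    }
}
```

The default only applies when the job itself does not set pool.name, which is the per-job behavior that changing poolNameProperty alone cannot provide.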

Having thought about this some more, I think the most elegant implementation is to actually
change PoolManager.getPoolName(JobInProgress job) to explicitly check whether the job's JobConf
has the key "mapred.fairscheduler.pool" set, and if so, return that value; otherwise, it should
return conf.get(poolNameProperty) as it does now. This should be a simple change to PoolManager.getPoolName.
You will also need to change PoolManager.setPool() to set the "mapred.fairscheduler.pool"
property rather than the poolNameProperty property on the job. Let me know if this makes sense.
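
A rough sketch of the lookup order proposed here, under stated assumptions: a plain Map replaces the JobConf so the example runs without Hadoop, the class name PoolNameSketch is hypothetical, and the value "default" for DEFAULT_POOL_NAME is assumed for illustration. The real PoolManager methods take a JobInProgress, which is omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: mirrors the proposed logic, not the actual PoolManager source.
final class PoolNameSketch {
    static final String EXPLICIT_POOL_PROPERTY = "mapred.fairscheduler.pool";
    static final String DEFAULT_POOL_NAME = "default"; // assumed value

    // 1. Use mapred.fairscheduler.pool if the job sets it.
    // 2. Otherwise use the property named by poolNameProperty.
    // 3. Otherwise fall back to DEFAULT_POOL_NAME.
    static String getPoolName(Map<String, String> jobConf, String poolNameProperty) {
        String explicit = jobConf.get(EXPLICIT_POOL_PROPERTY);
        if (explicit != null) {
            return explicit;
        }
        String fallback = jobConf.get(poolNameProperty);
        if (fallback != null) {
            return fallback;
        }
        return DEFAULT_POOL_NAME;
    }

    // Moving a job writes the explicit property, so the move always takes
    // precedence over whatever poolNameProperty points at.
    static void setPool(Map<String, String> jobConf, String pool) {
        jobConf.put(EXPLICIT_POOL_PROPERTY, pool);
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put("user.name", "alice"); // example submitter, an assumption
        System.out.println(getPoolName(jobConf, "user.name"));

        setPool(jobConf, "production");
        System.out.println(getPoolName(jobConf, "user.name"));
    }
}
```

Writing to the explicit key in setPool matters: if it kept writing to the poolNameProperty key, a job that already had mapred.fairscheduler.pool set could not be moved.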

> Provide a jobconf property for explicitly assigning a job to a pool
> -------------------------------------------------------------------
>
>                 Key: MAPREDUCE-707
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-707
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/fair-share
>            Reporter: Matei Zaharia
>            Priority: Trivial
>
> A common use case of the fair scheduler is to have one pool per user, but then to define
> some special pools for various production jobs, import jobs, etc. Therefore, it would be nice
> if jobs went by default to the pool of the user who submitted them, but there was a setting
> to explicitly place a job in another pool. Today, this can be achieved through a sort of trick
> in the JobConf:
> {code}
> <property>
>   <name>mapred.fairscheduler.poolnameproperty</name>
>   <value>pool.name</value>
> </property>
> <property>
>   <name>pool.name</name>
>   <value>${user.name}</value>
> </property>
> {code}
> This JIRA proposes to add a property called mapred.fairscheduler.pool that allows a job
> to be placed directly into a pool, avoiding the need for this trick.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

