hadoop-user mailing list archives

From Sagar Mehta <sagarme...@gmail.com>
Subject Re: Automatically mapping a job submitted by a particular user to a specific hadoop map-reduce queue
Date Fri, 26 Apr 2013 17:42:47 GMT
Hi Vinod,

Yes, this is exactly what we are doing right now. It works, but it is manual
and exposes the policy.
I think the JIRA that Sandy pointed out,
https://issues.apache.org/jira/browse/MAPREDUCE-5132, is a good first step
in that direction.


On Thu, Apr 25, 2013 at 1:44 PM, Vinod Kumar Vavilapalli <
vinodkv@hortonworks.com> wrote:

> The 'standard' way to do this is to use queue ACLs to restrict a particular
> user to submitting jobs to a subset of queues, and then let the user
> decide which queue in that subset to submit a job to.
> Thanks,
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> http://hortonworks.com/
> On Apr 24, 2013, at 6:22 PM, Sagar Mehta wrote:
> Hi Guys,
> We have a general purpose Hive cluster [about 200 nodes] which is used for
> various jobs like
>    - Production
>    - Experimental/Research
>    - Adhoc queries
> We are using the fair-share scheduler to schedule them, and for this we
> have three corresponding pools in the scheduler.
> *Here is what we want.*
> *A hive query submitted by a user with user-name A should go to one of
> the pools above based on a pre-defined mapping. We are wondering where/how
> to specify this mapping?*
> *We can do this manually by adding -Dmapred.job.queue.name="X" on a
> particular job run.*
> This puts the job on the map-reduce queue named "X" and the following
> configuration in the fair-share scheduler
>   <property>
>     <name>mapred.fairscheduler.poolnameproperty</name>
>     <value>mapred.job.queue.name</value>
>   </property>
> maps this to a pool named "X" in the fair-share scheduler.
> However, we [while wearing our Hadoop developer/admin hat] don't want the
> user/analyst to specify the queue, so that we can enforce a cluster-use policy.
> Based on the username, we want to automatically select which Hadoop
> queue, and subsequently which fair-share scheduler pool, the job should
> go to. I'm pretty sure this is a common use case, and I'm wondering how to do
> this in Hadoop.
> Any help/insights/pointers would be greatly appreciated.
> Sagar
> PS - Btw we are using Cloudera cdh3u2 and the user jobs are Hive queries.
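
For illustration only, the username-to-queue mapping asked about above could
be sketched as a small wrapper around the Hive command line. This is not
something proposed in the thread: the table USER_TO_QUEUE, the user and queue
names in it, and the helper functions are all hypothetical. The only pieces
taken from the discussion are the mapred.job.queue.name property and the idea
of setting it per job, and passing it through Hive's --hiveconf option is an
assumption about how the jobs are launched.

```python
import getpass
import shlex
import sys

# Hypothetical policy table: user name -> Hadoop queue (and, through
# mapred.fairscheduler.poolnameproperty, fair-scheduler pool).
USER_TO_QUEUE = {
    "etl": "production",
    "alice": "research",
}
# Hypothetical fallback queue for users not listed above.
DEFAULT_QUEUE = "adhoc"


def queue_for_user(user, mapping=USER_TO_QUEUE, default=DEFAULT_QUEUE):
    """Return the queue assigned to `user`, or the default queue."""
    return mapping.get(user, default)


def build_hive_command(user, hive_args):
    """Build a hive argument list with the queue property filled in."""
    queue = queue_for_user(user)
    return ["hive", "--hiveconf", "mapred.job.queue.name=" + queue] + list(hive_args)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    command = build_hive_command(getpass.getuser(), sys.argv[1:])
    print(" ".join(shlex.quote(part) for part in command))
```

A wrapper like this only hides the choice of queue from the analyst. It does
not stop anyone from calling hive directly with a different queue name, so
the queue ACLs Vinod describes would still be needed for enforcement.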
