hadoop-hdfs-user mailing list archives

From Sagar Mehta <sagarme...@gmail.com>
Subject Automatically mapping a job submitted by a particular user to a specific hadoop map-reduce queue
Date Thu, 25 Apr 2013 01:22:57 GMT
Hi Guys,

We have a general purpose Hive cluster [about 200 nodes] which is used for
various jobs like

   - Production
   - Experimental/Research
   - Adhoc queries

We are using the fair-share scheduler to schedule them, and for this we have
three corresponding pools in the scheduler.

*Here is what we want.*

*A hive query submitted by a user with user-name A should go to one of the
pools above based on a pre-defined mapping. We are wondering where/how to
specify this mapping?*

*We can do this manually by adding -Dmapred.job.queue.name="X" on a
particular job run.*

This puts the job on the map-reduce queue named "X", and our fair-share
scheduler configuration maps this to a pool named "X" in the fair-share
scheduler.

However, we [while wearing our Hadoop developer/admin hat] don't want to rely
on the user/analyst to specify that, because we want to enforce a cluster-use
policy.

Based on the user's username, we want to automatically select which Hadoop
queue, and subsequently which fair-share scheduler pool, the job should go
to. I'm pretty sure this is a common use case, and I'm wondering how to do
this in Hadoop.
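
To make the ask concrete, here is a minimal sketch of the lookup we have in mind. It is not an existing Hadoop or Hive API: the class QueueMapper, its methods, and the user and queue names used with it are all hypothetical, and where such a lookup could be hooked into job submission is exactly the open question. The returned string is what we would want set as mapred.job.queue.name for the job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Hadoop or Hive): resolves a user name
// to the map-reduce queue that the user's jobs should be submitted to.
final class QueueMapper {
    private final Map<String, String> userToQueue = new HashMap<>();
    private final String defaultQueue;

    // defaultQueue is used for any user without an explicit mapping.
    QueueMapper(String defaultQueue) {
        this.defaultQueue = defaultQueue;
    }

    // Registers a pre-defined user -> queue mapping; returns this for chaining.
    QueueMapper assign(String user, String queue) {
        userToQueue.put(user, queue);
        return this;
    }

    // Returns the queue for the given user, falling back to the default.
    String queueFor(String user) {
        return userToQueue.getOrDefault(user, defaultQueue);
    }
}
```

For example, with a default of "adhoc" and a mapping of a hypothetical user "etl_prod" to "production", queueFor("etl_prod") would select the production queue and any unmapped analyst would land in the ad hoc one.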

Any help/insights/pointers would be greatly appreciated.

PS - Btw, we are using Cloudera CDH3u2 and the user jobs are Hive queries.
