hadoop-hdfs-user mailing list archives

From Nitin Pawar <nitinpawar...@gmail.com>
Subject Re: Automatically mapping a job submitted by a particular user to a specific hadoop map-reduce queue
Date Thu, 25 Apr 2013 10:04:55 GMT
The current capacity scheduler controls which users can submit jobs
to which queue, along with other related features.
More of which you can read at

But on the Hive side, unless you set mapred.job.queue.name in the Hive CLI,
jobs will be submitted to the default job queue.

So basically what you would want to do is create the user, associate it with
a queue in the scheduler, and ask the user to set that queue in their local
hiverc.
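
As a rough sketch of that setup, assuming a hypothetical user-to-queue table
(the user and queue names below are made up for illustration), an admin
script could generate the line each user would put in their hiverc:

```python
# Hypothetical mapping from user name to scheduler queue. The names are
# illustrative only and are not taken from any real cluster.
USER_QUEUES = {
    "alice": "production",
    "bob": "research",
}

# Queue used for users that have no explicit entry (an assumption).
DEFAULT_QUEUE = "default"


def hiverc_line(user):
    """Return the hiverc statement that pins a user's Hive jobs to a queue."""
    queue = USER_QUEUES.get(user, DEFAULT_QUEUE)
    return "set mapred.job.queue.name={};".format(queue)


if __name__ == "__main__":
    for user in sorted(USER_QUEUES):
        print(user, "->", hiverc_line(user))
```

This only produces the text of the setting; the queue itself and its
permissions still have to be configured in the scheduler.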

I am not sure this can be part of Hive's metastore, because one user can be
allowed to submit jobs to multiple queues. The best way to handle that is to
set the property each time you open a session, or via hiverc.
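
A minimal sketch of the per-session approach, assuming a hypothetical table
of the queues each user is allowed to use. It only builds a hive command
line (passing the property through --hiveconf) and does not enforce
anything on the scheduler side:

```python
# Hypothetical table of queues each user may submit to. The first entry is
# treated as that user's default queue (an assumption for this sketch).
ALLOWED_QUEUES = {
    "alice": ["production", "adhoc"],
    "bob": ["research"],
}


def hive_command(user, query, queue=None):
    """Build a hive CLI argument list that runs `query` on a permitted queue."""
    allowed = ALLOWED_QUEUES.get(user)
    if not allowed:
        raise ValueError("no queue configured for user {!r}".format(user))
    if queue is None:
        queue = allowed[0]  # fall back to the user's default queue
    elif queue not in allowed:
        raise ValueError(
            "user {!r} may not submit to queue {!r}".format(user, queue)
        )
    return [
        "hive",
        "--hiveconf", "mapred.job.queue.name=" + queue,
        "-e", query,
    ]


if __name__ == "__main__":
    print(" ".join(hive_command("alice", "select 1", queue="adhoc")))
```

A wrapper like this would sit in front of the Hive CLI, so users never type
the queue name themselves unless they are allowed more than one.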

On Thu, Apr 25, 2013 at 12:11 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> Hi Sagar,
> This capability currently does not exist in the fair scheduler (or other
> schedulers, as far as I know), but a JIRA has been filed recently that
> addresses a similar need. Would
> https://issues.apache.org/jira/browse/MAPREDUCE-5132 work for what you're
> trying to do?  If not, would you mind filing a new JIRA for the
> functionality you'd want?
> -Sandy
> On Wed, Apr 24, 2013 at 6:22 PM, Sagar Mehta <sagarmehta@gmail.com> wrote:
>> Hi Guys,
>> We have a general purpose Hive cluster [about 200 nodes] which is used
>> for various jobs like
>>    - Production
>>    - Experimental/Research
>>    - Adhoc queries
>> We are using the fair-share scheduler to schedule them and for this we
>> have corresponding 3 pools in the scheduler.
>> *Here is what we want.*
>> *A hive query submitted by a user with user-name A should go to one of
>> the pools above based on a pre-defined mapping. We are wondering where/how
>> to specify this mapping?*
>> *We can do this manually by adding -Dmapred.job.queue.name="X" on a
>> particular job run.*
>> This puts the job on the map-reduce queue named "X" and the following
>> configuration in the fair-share scheduler
>>   <property>
>>     <name>mapred.fairscheduler.poolnameproperty</name>
>>     <value>mapred.job.queue.name</value>
>>   </property>
>> maps this to a pool named "X" in the fair-share scheduler.
>> However we [while wearing our Hadoop developer/admin hat] don't want the
>> user/analyst to specify that so as to enforce some cluster-use policy.
>> Based on his/her username we want to automatically select which hadoop
>> queue and subsequently which fair-share scheduler pool, his/her job should
>> go to. I'm pretty sure this is a common use-case and wondering how to do
>> this in Hadoop.
>> Any help/insights/pointers would be greatly appreciated.
>> Sagar
>> PS - Btw we are using Cloudera cdh3u2 and the user jobs are Hive queries.

Nitin Pawar
