hadoop-common-user mailing list archives

From praveenesh kumar <praveen...@gmail.com>
Subject Re: Understanding fair schedulers
Date Wed, 25 Jan 2012 14:30:37 GMT
Also, with the method mentioned above, my problem is that I end up with
one pool per user (which is obviously not a good way of configuring the
scheduler). How can I allocate multiple users to one pool in the XML
properties, so that I don't have to set any options inside my code?
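
For illustration, a minimal sketch of one way this could look, following
the group.name suggestion quoted below. It assumes
mapred.fairscheduler.poolnameproperty is set to group.name, that Unix
groups named Hadoop-users and Admin-users exist, and that user1 and user2
belong to the first and user3, user4, and user5 to the second. The pool
names and limits are the ones from the first mail in this thread; the
group setup is hypothetical and untested, and which group is used when a
user belongs to several is not covered here.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml sketch. Assumes pools are named after Unix groups
     (poolnameproperty = group.name); the group names are hypothetical. -->
<allocations>
  <!-- Jobs from members of the assumed group "Hadoop-users" -->
  <pool name="Hadoop-users">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>50</maxMaps>
    <maxReduces>50</maxReduces>
  </pool>
  <!-- Jobs from members of the assumed group "Admin-users" -->
  <pool name="Admin-users">
    <minMaps>20</minMaps>
    <minReduces>20</minReduces>
    <maxMaps>80</maxMaps>
    <maxReduces>80</maxReduces>
  </pool>
  <!-- Global default, declared once for the whole file -->
  <userMaxJobsDefault>3</userMaxJobsDefault>
</allocations>
```

With a layout like this, no per-user pool entries should be needed; users
would land in a pool through their group membership alone.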

Thanks,
Praveenesh

On Wed, Jan 25, 2012 at 7:55 PM, praveenesh kumar <praveenesh@gmail.com> wrote:

> I am looking for a solution where we can do this permanently, without
> specifying these things inside the jobs.
> I want to keep these things hidden from the end user.
> The end user would just write Pig scripts, and all the jobs submitted by
> a particular user would be submitted to that user's pool automatically.
>
> What I am doing right now is something like this:
>
>  <allocations>
>   <pool name="ABC">
>     <minMaps>10</minMaps>
>     <minReduces>10</minReduces>
>     <maxMaps>192</maxMaps>
>     <maxReduces>96</maxReduces>
>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>   </pool>
>   <user name="ABC">
>
>     <maxRunningJobs>6</maxRunningJobs>
>   </user>
>   <userMaxJobsDefault>3</userMaxJobsDefault>
>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
>
>   <pool name="XYZ">
>     <minMaps>10</minMaps>
>     <minReduces>10</minReduces>
>     <maxMaps>192</maxMaps>
>     <maxReduces>96</maxReduces>
>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>   </pool>
>   <user name="XYZ">
>
>    <maxRunningJobs>6</maxRunningJobs>
>   </user>
>   <userMaxJobsDefault>3</userMaxJobsDefault>
>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
>
> </allocations>
>
> By doing this, I am able to see a different pool per user, without
> mentioning anything inside the jobs.
> Jobs automatically go to their respective pools.
>
> But what I want to know is: is this the right way to do it?
>
> Thanks,
> Praveenesh
>
>
>
> On Wed, Jan 25, 2012 at 7:36 PM, Harsh J <harsh@cloudera.com> wrote:
>
>> Set the property in Pig with the 'set' command or other ways:
>> http://pig.apache.org/docs/r0.9.1/cmds.html#set or
>> http://pig.apache.org/docs/r0.9.1/start.html#properties
>>
>> As Srinivas covered earlier, pool allocation can be done per-user if
>> you set the scheduler poolnameproperty to "user.name". Per group if
>> you set the property to "group.name".
>>
>> Then you can provide per-poolname config overrides via the "pool"
>> element config described in
>>
>> http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
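>>
>> A minimal sketch of the per-group setting, assuming it is placed in
>> the JobTracker's mapred-site.xml and that group.name is populated for
>> submitted jobs on your Hadoop version (neither is verified here):
>>
>> ```xml
>> <!-- mapred-site.xml fragment (assumed location): one pool per Unix group -->
>> <property>
>>   <name>mapred.fairscheduler.poolnameproperty</name>
>>   <value>group.name</value>
>> </property>
>> ```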
>>
>> On Wed, Jan 25, 2012 at 7:01 PM, praveenesh kumar <praveenesh@gmail.com>
>> wrote:
>> > I am running Pig jobs. How can I specify which pool they should run in?
>> > Also, do you mean that pool allocation is done per job, not per user?
>> >
>> >
>> > On Wed, Jan 25, 2012 at 6:14 PM, Srinivas Surasani <vasajb@gmail.com>
>> wrote:
>> >
>> >> Praveenesh,
>> >>
>> >> You can try setting "mapred.fairscheduler.pool" to your pool name
>> >> while running the job. By default,
>> >> mapred.fairscheduler.poolnameproperty is set to user.name (each job
>> >> run by a user is allocated to that user's named pool), and you can
>> >> also change this property to group.name.
>> >>
>> >> Srinivas --
>> >>
>> >> Also, you can set
>> >>
>> >> On Wed, Jan 25, 2012 at 6:24 AM, praveenesh kumar
>> >> <praveenesh@gmail.com> wrote:
>> >>
>> >> > Understanding Fair Schedulers better.
>> >> >
>> >> > Can we create multiple pools in the Fair Scheduler? I guess yes.
>> >> > Please correct me if I am wrong.
>> >> >
>> >> > Suppose I have 2 pools in my fair-scheduler.xml:
>> >> >
>> >> > 1. Hadoop-users: min maps: 10, max maps: 50, min reduces: 10,
>> >> >    max reduces: 50
>> >> > 2. Admin-users: min maps: 20, max maps: 80, min reduces: 20,
>> >> >    max reduces: 80
>> >> >
>> >> > I have 5 users who will be using these pools. How do I allocate
>> >> > specific pools to specific users?
>> >> >
>> >> > Suppose I want user1 and user2 to use the "Hadoop-users" pool and
>> >> > user3, user4, and user5 to use the "Admin-users" pool.
>> >> >
>> >> > In
>> >> > http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
>> >> > they show an allocation file like this:
>> >> >
>> >> > <?xml version="1.0"?>
>> >> > <allocations>
>> >> >  <pool name="sample_pool">
>> >> >    <minMaps>5</minMaps>
>> >> >    <minReduces>5</minReduces>
>> >> >    <maxMaps>25</maxMaps>
>> >> >    <maxReduces>25</maxReduces>
>> >> >    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>> >> >  </pool>
>> >> >  <user name="sample_user">
>> >> >    <maxRunningJobs>6</maxRunningJobs>
>> >> >  </user>
>> >> >  <userMaxJobsDefault>3</userMaxJobsDefault>
>> >> >  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
>> >> > </allocations>
>> >> >
>> >> > I tried creating more pools and that works, but how do I allocate
>> >> > users to specific pools?
>> >> >
>> >> > Thanks,
>> >> > Praveenesh
>> >> >
>> >>
>>
>>
>>
>> --
>> Harsh J
>> Customer Ops. Engineer, Cloudera
>>
>
>
