hadoop-common-user mailing list archives

From praveenesh kumar <praveen...@gmail.com>
Subject Re: Understanding fair schedulers
Date Wed, 25 Jan 2012 14:25:33 GMT
I am looking for a solution where we can do this permanently, without
specifying these things inside the jobs.
I want to keep these settings hidden from the end user.
The end user would just write Pig scripts, and all jobs submitted by a
particular user would be submitted to that user's pool automatically.

What I am doing right now is something like this:

 <allocations>
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="ABC">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>

  <pool name="XYZ">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="XYZ">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>

</allocations>
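
For comparison, here is a sketch of the same file with the cluster-wide
elements declared only once. It assumes that userMaxJobsDefault and
fairSharePreemptionTimeout are global settings rather than per-pool ones, so
repeating them after each pool should not be needed. ABC and XYZ are the same
placeholder names as above.

```xml
<?xml version="1.0"?>
<!-- Sketch only: same pools and limits as above, global elements listed once. -->
<allocations>
  <!-- Pool named after user ABC -->
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <!-- Pool named after user XYZ -->
  <pool name="XYZ">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>

  <!-- Per-user job limits -->
  <user name="ABC">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <user name="XYZ">
    <maxRunningJobs>6</maxRunningJobs>
  </user>

  <!-- Assumed to be cluster-wide defaults, so declared a single time -->
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
```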

By doing this, I can see a separate pool per user without
specifying anything inside the jobs.
Jobs go to their respective pools automatically.
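
As far as I understand, this works because the scheduler takes the pool name
from the submitting user, which is the poolnameproperty setting discussed in
the replies quoted below. A minimal sketch of the matching entry, assuming it
lives in mapred-site.xml and is either left at its default or set explicitly:

```xml
<!-- mapred-site.xml fragment (file location is an assumption) -->
<property>
  <!-- Job property whose value is used as the pool name -->
  <name>mapred.fairscheduler.poolnameproperty</name>
  <!-- user.name gives one pool per submitting user -->
  <value>user.name</value>
</property>
```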

But what I want to know is: is this the right way to do it?

Thanks,
Praveenesh


On Wed, Jan 25, 2012 at 7:36 PM, Harsh J <harsh@cloudera.com> wrote:

> Set the property in Pig with the 'set' command or other ways:
> http://pig.apache.org/docs/r0.9.1/cmds.html#set or
> http://pig.apache.org/docs/r0.9.1/start.html#properties
>
> As Srinivas covered earlier, pool allocation can be done per-user if
> you set the scheduler poolnameproperty to "user.name". Per group if
> you set the property to "group.name".
>
> Then you can provide per-poolname config overrides via the "pool"
> element config described in
>
> http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
>
> On Wed, Jan 25, 2012 at 7:01 PM, praveenesh kumar <praveenesh@gmail.com> wrote:
> > I am running Pig jobs. How can I specify which pool they should run in?
> > Also, do you mean that pool allocation is done per job, not per user?
> >
> >
> > On Wed, Jan 25, 2012 at 6:14 PM, Srinivas Surasani <vasajb@gmail.com> wrote:
> >
> >> Praveenesh,
> >>
> >> You can try setting "mapred.fairscheduler.pool" to your pool name while
> >> running the job. By default, mapred.fairscheduler.poolnameproperty is set
> >> to user.name (each job run by a user is allocated to that user's named
> >> pool), and you can also change this property to group.name.
> >>
> >> Srinivas --
> >>
> >> Also, you can set
> >>
> >> On Wed, Jan 25, 2012 at 6:24 AM, praveenesh kumar <praveenesh@gmail.com> wrote:
> >>
> >> > Understanding Fair Schedulers better.
> >> >
> >> > Can we create multiple pools in the Fair Scheduler? I guess yes. Please
> >> > correct me if I am wrong.
> >> >
> >> > Suppose I have 2 pools in my fair-scheduler.xml
> >> >
> >> > 1. Hadoop-users: Min map: 10, Max map: 50, Min reduce: 10, Max reduce: 50
> >> > 2. Admin-users: Min map: 20, Max map: 80, Min reduce: 20, Max reduce: 80
> >> >
> >> > I have 5 users who will be using these pools. How do I allocate
> >> > specific pools to specific users?
> >> >
> >> > Suppose I want user1 and user2 to use the "Hadoop-users" pool, and
> >> > user3, user4, and user5 to use the "Admin-users" pool.
> >> >
> >> > In http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
> >> > they show an allocation file like this:
> >> >
> >> > <?xml version="1.0"?>
> >> > <allocations>
> >> >  <pool name="sample_pool">
> >> >    <minMaps>5</minMaps>
> >> >    <minReduces>5</minReduces>
> >> >    <maxMaps>25</maxMaps>
> >> >    <maxReduces>25</maxReduces>
> >> >    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> >> >  </pool>
> >> >  <user name="sample_user">
> >> >    <maxRunningJobs>6</maxRunningJobs>
> >> >  </user>
> >> >  <userMaxJobsDefault>3</userMaxJobsDefault>
> >> >  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> >> > </allocations>
> >> >
> >> > I tried creating more pools and that works, but how do I assign users
> >> > to specific pools?
> >> >
> >> > Thanks,
> >> > Praveenesh
> >> >
> >>
>
>
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera
>
