hadoop-common-user mailing list archives

From praveenesh kumar <praveen...@gmail.com>
Subject Re: Understanding fair schedulers
Date Wed, 25 Jan 2012 15:19:29 GMT
In that case, would I use a group name tag in the allocations file, like
this, inside each pool?

  <group name="ABC">
    <maxRunningJobs>6</maxRunningJobs>
  </group>

Thanks,
Praveenesh
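
For reference, a minimal sketch of how a group-based setup might look. It
assumes the Hadoop 1.x fair scheduler, whose documented allocation-file
elements include pool and user but no group tag. With
mapred.fairscheduler.poolnameproperty set to group.name in mapred-site.xml,
each pool is named after a group, and the per-group limits go inside the pool
element. The name ABC and the numbers are carried over from this thread;
treating ABC as a Unix group is an assumption, and it is worth verifying that
group.name is actually populated for submitted jobs on your Hadoop version.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml (allocation file). Assumes mapred-site.xml sets
     mapred.fairscheduler.poolnameproperty to group.name, so that a job
     lands in the pool named after the submitting user's group. -->
<allocations>
  <!-- "ABC" is the example name from this thread, assumed here to be a Unix group. -->
  <pool name="ABC">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <!-- Caps concurrently running jobs for the whole pool, i.e. all users in the group. -->
    <maxRunningJobs>6</maxRunningJobs>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <!-- Global defaults appear once, outside any pool. -->
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
```

Per-user limits can still be added with separate user elements if individual
users within a group need their own cap.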

On Wed, Jan 25, 2012 at 8:08 PM, Harsh J <harsh@cloudera.com> wrote:

> A solution would be to place your users into groups and use the
> group.name identifier as the poolnameproperty. Would this work for
> you instead?
>
> On Wed, Jan 25, 2012 at 8:00 PM, praveenesh kumar <praveenesh@gmail.com>
> wrote:
> > Also, with the above-mentioned method, my problem is that I end up with one
> > pool per user (that's obviously not a good way of configuring schedulers).
> > How can I allocate multiple users to one pool in the XML properties, so
> > that I don't have to set any options inside my code?
> >
> > Thanks,
> > Praveenesh
> >
> > On Wed, Jan 25, 2012 at 7:55 PM, praveenesh kumar <praveenesh@gmail.com>
> > wrote:
> >
> >> I am looking for a solution where we can do this permanently, without
> >> specifying these things inside the jobs.
> >> I want to keep these things hidden from the end user.
> >> The end user would just write Pig scripts, and all the jobs submitted by a
> >> particular user would be submitted to that user's pool automatically.
> >>
> >> What I am doing right now is something like this:
> >>
> >>  <allocations>
> >>   <pool name="ABC">
> >>     <minMaps>10</minMaps>
> >>     <minReduces>10</minReduces>
> >>     <maxMaps>192</maxMaps>
> >>     <maxReduces>96</maxReduces>
> >>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> >>   </pool>
> >>   <user name="ABC">
> >>
> >>     <maxRunningJobs>6</maxRunningJobs>
> >>   </user>
> >>   <userMaxJobsDefault>3</userMaxJobsDefault>
> >>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> >>
> >>   <pool name="XYZ">
> >>     <minMaps>10</minMaps>
> >>     <minReduces>10</minReduces>
> >>     <maxMaps>192</maxMaps>
> >>     <maxReduces>96</maxReduces>
> >>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> >>   </pool>
> >>   <user name="XYZ">
> >>
> >>    <maxRunningJobs>6</maxRunningJobs>
> >>   </user>
> >>   <userMaxJobsDefault>3</userMaxJobsDefault>
> >>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> >>
> >> </allocations>
> >>
> >> By doing this, I am able to see a different pool per user without
> >> specifying anything inside the jobs.
> >> Jobs go to their respective pools automatically.
> >>
> >> But what I want to know is: is this the right way to do it?
> >>
> >> Thanks,
> >> Praveenesh
> >>
> >>
> >>
> >> On Wed, Jan 25, 2012 at 7:36 PM, Harsh J <harsh@cloudera.com> wrote:
> >>
> >>> Set the property in Pig with the 'set' command or other ways:
> >>> http://pig.apache.org/docs/r0.9.1/cmds.html#set or
> >>> http://pig.apache.org/docs/r0.9.1/start.html#properties
> >>>
> >>> As Srinivas covered earlier, pool allocation can be done per-user if
> >>> you set the scheduler poolnameproperty to "user.name". Per group if
> >>> you set the property to "group.name".
> >>>
> >>> Then you can provide per-poolname config overrides via the "pool"
> >>> element config described in
> >>> http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29
> >>>
> >>> On Wed, Jan 25, 2012 at 7:01 PM, praveenesh kumar <praveenesh@gmail.com>
> >>> wrote:
> >>> > I am running Pig jobs. How can I specify which pool they should run in?
> >>> > Also, do you mean that pool allocation is done per job, not per user?
> >>> >
> >>> >
> >>> > On Wed, Jan 25, 2012 at 6:14 PM, Srinivas Surasani <vasajb@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> Praveenesh,
> >>> >>
> >>> >> You can try setting "mapred.fairscheduler.pool" to your pool name while
> >>> >> running the job. By default, mapred.fairscheduler.poolnameproperty is set
> >>> >> to user.name (each job run by a user is allocated to that user's named
> >>> >> pool), and you can also change this property to group.name.
> >>> >>
> >>> >> Srinivas --
> >>> >>
> >>> >> Also, you can set
> >>> >>
> >>> >> On Wed, Jan 25, 2012 at 6:24 AM, praveenesh kumar <praveenesh@gmail.com>
> >>> >> wrote:
> >>> >>
> >>> >> > I am trying to understand the Fair Scheduler better.
> >>> >> >
> >>> >> > Can we create multiple pools in the Fair Scheduler? I guess yes. Please
> >>> >> > correct me if I am wrong.
> >>> >> >
> >>> >> > Suppose I have 2 pools in my fair-scheduler.xml
> >>> >> >
> >>> >> > 1. Hadoop-users: Min map: 10, Max map: 50, Min Reduce: 10, Max Reduce: 50
> >>> >> > 2. Admin-users: Min map: 20, Max map: 80, Min Reduce: 20, Max Reduce: 80
> >>> >> >
> >>> >> > I have 5 users who will be using these pools. How do I allocate
> >>> >> > specific pools to specific users?
> >>> >> >
> >>> >> > Suppose I want user1 and user2 to use the "Hadoop-users" pool, and
> >>> >> > user3, user4, and user5 to use the "Admin-users" pool.
> >>> >> >
> >>> >> > In http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
> >>> >> > they show an allocations file something like this:
> >>> >> >
> >>> >> > <?xml version="1.0"?>
> >>> >> > <allocations>
> >>> >> >  <pool name="sample_pool">
> >>> >> >    <minMaps>5</minMaps>
> >>> >> >    <minReduces>5</minReduces>
> >>> >> >    <maxMaps>25</maxMaps>
> >>> >> >    <maxReduces>25</maxReduces>
> >>> >> >    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
> >>> >> >  </pool>
> >>> >> >  <user name="sample_user">
> >>> >> >    <maxRunningJobs>6</maxRunningJobs>
> >>> >> >  </user>
> >>> >> >  <userMaxJobsDefault>3</userMaxJobsDefault>
> >>> >> >  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> >>> >> > </allocations>
> >>> >> >
> >>> >> > I tried creating more pools and that works, but how do I assign
> >>> >> > users to specific pools?
> >>> >> >
> >>> >> > Thanks,
> >>> >> > Praveenesh
> >>> >> >
> >>> >>
> >>>
> >>>
> >>>
> >>> --
> >>> Harsh J
> >>> Customer Ops. Engineer, Cloudera
> >>>
> >>
> >>
>
>
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera
>
