Subject: Re: Fair Scheduler is not Fair why?
From: Jeff Bean <jwfbean@cloudera.com>
To: user@hadoop.apache.org
Date: Wed, 16 Jan 2013 09:02:20 -0800

Validate your scheduler capacity and behavior by using sleep jobs. Submit
sleep jobs to pools that mirror your production jobs and check that the
scheduler's pool allocation behaves as you expect. The nice thing about
sleep jobs is that you can mimic your real jobs: the number of tasks and
how long they run.

You should be able to determine whether the hypothesis posed in this
thread is correct: that all the slots are taken by other tasks. Indeed,
your UI says that research has 90 running tasks after having completed
over 4000, but your email says no tasks are scheduled. I'm a little
confused.

Jeff

On Wed, Jan 16, 2013 at 8:50 AM, Nan Zhu <zhunansjtu@gmail.com> wrote:

> BTW, what I mentioned is fair-share preemption, not minimum share.
>
> An alternative way to achieve that is to set the minimum shares of the
> two queues to be equal (or to any other allocation scheme you like), with
> their sum equal to the capacity of the cluster, and enable minimum-share
> preemption.
>
> Good luck!
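Jeff's sleep-job check above could look like the sketch below. The examples-jar path and the `mapred.fairscheduler.pool` property are assumptions for an MR1/CDH4 install (if your pools are keyed on `user.name`, submit as that user instead), and the script only prints the hadoop commands so they can be reviewed before running:

```shell
# Hypothetical jar location -- adjust for your install.
EXAMPLES_JAR=/usr/lib/hadoop-0.20-mapreduce/hadoop-examples.jar

# Print (not run) a SleepJob submission that targets a given pool and
# mimics a real job's shape: task counts and per-task runtimes.
sleep_job() {
  pool=$1; maps=$2; reduces=$3; map_ms=$4; reduce_ms=$5
  echo hadoop jar "$EXAMPLES_JAR" sleep \
    -D mapred.fairscheduler.pool="$pool" \
    -m "$maps" -r "$reduces" -mt "$map_ms" -rt "$reduce_ms"
}

# Mirror the two workloads from the thread: one job per pool, with more
# maps than the cluster can run at once, so contention is visible.
sleep_job tech     2000 24 60000 60000
sleep_job research 2000 24 60000 60000
```

If the scheduler is healthy, the two sleep jobs should converge toward equal running map counts; if research again sits idle, the problem is in the scheduler configuration rather than in your production jobs.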
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
>
>
> On Wednesday, 16 January, 2013 at 11:43 AM, Nan Zhu wrote:
>
> I think you should do that, so that when the allocation is inconsistent
> with the fair shares, tasks in the queue that occupies more than its fair
> share will be killed and the freed slots will be assigned to the other
> queue (assuming the two weights are the same).
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
>
>
> On Wednesday, 16 January, 2013 at 11:32 AM, Dhanasekaran Anbalagan wrote:
>
> Hi Nan,
>
> We have not enabled Fair Scheduler preemption.
>
> -Dhanasekaran.
>
> Did I learn something today? If not, I wasted it.
>
>
> On Wed, Jan 16, 2013 at 11:21 AM, Nan Zhu <zhunansjtu@gmail.com> wrote:
>
> Have you enabled task preemption?
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
>
>
> On Wednesday, 16 January, 2013 at 10:45 AM, Justin Workman wrote:
>
> Looks like the weight for both pools is equal and all map slots are used,
> so I don't believe either pool has priority for the next slots. Try
> setting the research weight to 2. This should allow research to take
> slots as tech releases them.
>
> Sent from my iPhone
>
> On Jan 16, 2013, at 8:26 AM, Dhanasekaran Anbalagan <bugcy013@gmail.com> wrote:
>
> Hi Guys,
>
> We configured the Fair Scheduler with CDH4, but it is not working
> properly.
> Map Task Capacity = 1380
> Reduce Task Capacity = 720
>
> We created two users, tech and research, and configured an equal weight
> of 1. But when I start a job as the research user, no mappers are
> allocated. Why? Please guide me, guys.
>
> <?xml version="1.0"?>
> <allocations>
>   <pool name="tech">
>     <minMaps>5</minMaps>
>     <minReduces>5</minReduces>
>     <maxRunningJobs>30</maxRunningJobs>
>     <weight>1.0</weight>
>   </pool>
>   <pool name="research">
>     <minMaps>5</minMaps>
>     <minReduces>5</minReduces>
>     <maxRunningJobs>30</maxRunningJobs>
>     <weight>1.0</weight>
>   </pool>
> </allocations>
>
> Note: we have tested with a Hadoop Streaming job.
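Taken together, Justin's weight suggestion and Nan's preemption suggestion would change the allocation file quoted above roughly as follows. This is a sketch: the timeout values are illustrative, the element names come from the MR1 Fair Scheduler allocation-file format, and preemption must also be switched on separately via `mapred.fairscheduler.preemption=true` in mapred-site.xml.

```
<?xml version="1.0"?>
<allocations>
  <pool name="tech">
    <minMaps>5</minMaps>
    <minReduces>5</minReduces>
    <maxRunningJobs>30</maxRunningJobs>
    <weight>1.0</weight>
  </pool>
  <pool name="research">
    <minMaps>5</minMaps>
    <minReduces>5</minReduces>
    <maxRunningJobs>30</maxRunningJobs>
    <!-- Justin's suggestion: give research twice tech's share of freed slots. -->
    <weight>2.0</weight>
    <!-- Illustrative: preempt if this pool waits 10 min below its min share. -->
    <minSharePreemptionTimeout>600</minSharePreemptionTimeout>
  </pool>
  <!-- Illustrative: preempt if any pool sits below half its fair share for 10 min. -->
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>
```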
>
> Fair Scheduler Administration
>
> Pools
>
> Pool      Running Jobs | Maps:  Min  Max  Running  Fair Share | Reduces:  Min  Max  Running  Fair Share | Mode
> research             1 |          5    -       90       690.0 |             5    -        0         0.0 | FAIR
> tech                 3 |          5    -     1266       690.0 |             5    -       24        24.0 | FAIR
> default              0 |          0    -        0         0.0 |             0    -        0         0.0 | FAIR
>
> Running Jobs
>
> Jan 16, 08:51  job_201301071639_2118  tech      streamjob5335328828469969152.jar
>     Maps:    30466 / 53724   running 583   fair share 313.5   weight 1.0
>     Reduces:     0 / 24      running   0   fair share   0.0   weight 1.0
>
> Jan 16, 09:56  job_201301071639_2147  research  streamjob8832181817213433660.jar
>     Maps:     4175 / 9581    running  90   fair share 690.0   weight 1.0
>     Reduces:     0 / 24      running   0   fair share   0.0   weight 1.0
>
> Jan 16, 10:01  job_201301071639_2148  tech      streamjob8773848575543653055.jar
>     Maps:     1842 / 15484   running 620   fair share 313.5   weight 1.0
>     Reduces:     0 / 24      running   0   fair share   0.0   weight 1.0
>
> Jan 16, 10:08  job_201301071639_2155  tech      counterfactualsim-prod.eagle-EagleDepthSignalDisabled-prod.eagle
>     Maps:      387 / 450     running  63   fair share  63.0   weight 1.0
>     Reduces:     0 / 24      running  24   fair share  24.0   weight 1.0
>
> --
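As a sanity check on the numbers in the pool table: both pools demand more map slots than the cluster has and carry equal weight, so each pool's map fair share is half of the 1380-slot map capacity. A minimal sketch of that arithmetic:

```shell
# Map capacity and pool weights from the thread.
MAP_CAPACITY=1380
WEIGHT_TECH=10       # weights are 1.0 each; scaled by 10 for integer math
WEIGHT_RESEARCH=10
TOTAL_WEIGHT=$((WEIGHT_TECH + WEIGHT_RESEARCH))

# Each pool's fair share is its weighted slice of the map capacity; demand
# does not cap it here, since both pools have far more runnable maps than this.
TECH_SHARE=$((MAP_CAPACITY * WEIGHT_TECH / TOTAL_WEIGHT))
RESEARCH_SHARE=$((MAP_CAPACITY * WEIGHT_RESEARCH / TOTAL_WEIGHT))
echo "tech: $TECH_SHARE  research: $RESEARCH_SHARE"   # tech: 690  research: 690
```

Research's fair share of 690 maps is far above its 90 running maps. Without preemption the scheduler never kills tech's tasks to close that gap; it only hands research the slots that tech frees, which is exactly the behavior shown in the table.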