Date: Fri, 1 Aug 2014 11:41:05 +0200
Subject: Fair Scheduler issue
From: Julien Naour <julnaour@gmail.com>
To: user@hadoop.apache.org

Hello,

I'm currently using HDP 2.0, so it's Hadoop 2.2.0.
My cluster consists of 4 nodes with 16 cores, 16 GB RAM and 4 x 3 TB disks each.

Recently we went from 2 users to 8, so we now need a more appropriate scheduler.
We began with the Capacity Scheduler. There were some issues with the different queues, particularly when Spark shells held resources for a long time.
So we decided to try the Fair Scheduler, which seemed like a good solution.
The problem is that the Fair Scheduler does not use all of the available resources: it is capped at 73% of the available memory for one job, 63% for 2 jobs and 45% for 3 jobs. The problem could come from the shells that hold resources for a long time.

We tried some configuration such as yarn.scheduler.fair.user-as-default-queue=false, and played with the minimum resources allocated (minResources) in fair-scheduler.xml, but that does not seem to resolve the issue.

Any advice or good practices for setting up the Fair Scheduler properly?
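For reference, this is roughly the kind of configuration we have been testing. The queue names, memory/vcore values and allocation file path below are only placeholders for the example, not our exact settings.

In yarn-site.xml:

    <property>
      <name>yarn.resourcemanager.scheduler.class</name>
      <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
    </property>
    <property>
      <!-- submit jobs to named queues instead of one queue per user -->
      <name>yarn.scheduler.fair.user-as-default-queue</name>
      <value>false</value>
    </property>
    <property>
      <!-- example path, adjust to wherever your distribution keeps it -->
      <name>yarn.scheduler.fair.allocation.file</name>
      <value>/etc/hadoop/conf/fair-scheduler.xml</value>
    </property>

And in fair-scheduler.xml, something along these lines:

    <?xml version="1.0"?>
    <allocations>
      <!-- example queue for batch jobs -->
      <queue name="batch">
        <minResources>8192 mb,8 vcores</minResources>
        <weight>2.0</weight>
        <schedulingPolicy>fair</schedulingPolicy>
      </queue>
      <!-- example queue for long-running shells (e.g. Spark shells), capped so they cannot hold the whole cluster -->
      <queue name="shells">
        <minResources>4096 mb,4 vcores</minResources>
        <maxResources>16384 mb,16 vcores</maxResources>
        <weight>1.0</weight>
      </queue>
    </allocations>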
Regards,
Julien