Subject: Re: Hadoop 1.0.4 Performance Problem
From: Amit Sela <amits@infolinks.com>
To: user@hadoop.apache.org
Date: Tue, 27 Nov 2012 11:50:40 +0200

Hi Jon,

I recently upgraded our cluster from Hadoop 0.20.3-append to Hadoop 1.0.4
and I haven't noticed any performance issues. By "multiple assignment
feature" do you mean speculative execution
(mapred.map.tasks.speculative.execution and
mapred.reduce.tasks.speculative.execution)?
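In case it helps, those are the switches as they'd appear in
mapred-site.xml (Hadoop 1.x property names; both default to true):

  <!-- disable speculative execution for maps and reduces -->
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>false</value>
  </property>
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>false</value>
  </property>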
On Mon, Nov 26, 2012 at 11:49 PM, Jon Allen <jayayedev@gmail.com> wrote:

> Problem solved, but worth warning others about.
>
> Before the upgrade the reducers for the terasort process had been evenly
> distributed around the cluster - one per task tracker in turn, looping
> around the cluster until all tasks were allocated. After the upgrade all
> reduce tasks were submitted to a small number of task trackers - tasks
> were assigned to one task tracker until its slots were full before moving
> on to the next. Skewing the reducers like this quite clearly hurt the
> benchmark performance.
>
> The reason for this turns out to be the fair scheduler rewrite
> (MAPREDUCE-2981), which appears to have subtly modified the behaviour of
> the assign-multiple property. Previously this property caused a single map
> and a single reduce task to be allocated per task tracker heartbeat
> (rather than the default of a map or a reduce). After the upgrade it
> allocates as many tasks as there are available task slots. Turning off the
> multiple assignment feature returned the terasort to its pre-upgrade
> performance.
>
> I can see potential benefits to this change and need to think through the
> consequences for real-world applications (though in practice we're likely
> to move away from the fair scheduler due to MAPREDUCE-4451). Investigating
> this has been a pain, so to warn other users: is there anywhere central
> that can be used to record upgrade gotchas like this?
>
>
> On Fri, Nov 23, 2012 at 12:02 PM, Jon Allen <jayayedev@gmail.com> wrote:
>
>> Hi,
>>
>> We've just upgraded our cluster from Hadoop 0.20.203 to 1.0.4 and have
>> hit performance problems. Before the upgrade a 15TB terasort took about
>> 45 minutes; afterwards it takes just over an hour. Looking in more
>> detail, it appears the shuffle phase has increased from 20 minutes to 40
>> minutes. Does anyone have any thoughts about what's changed between these
>> releases that may have caused this?
>>
>> The only change to the system has been to Hadoop. We moved from a
>> tarball install of 0.20.203 with all processes running as hadoop to an
>> RPM deployment of 1.0.4 with processes running as hdfs and mapred.
>> Nothing else has changed.
>>
>> As a related question, we're still running with a configuration that was
>> tuned for version 0.20.1. Are there any recommendations for tuning
>> properties introduced in recent versions that are worth investigating?
>>
>> Thanks,
>> Jon
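P.S. For anyone who finds this thread in the archives: going by Jon's
description and the fair scheduler documentation, the pre-upgrade
scheduling behaviour should come back with something like this in
mapred-site.xml (a sketch - I haven't tested it on 1.0.4 myself):

  <!-- turn off multiple assignment per heartbeat entirely -->
  <property>
    <name>mapred.fairscheduler.assignmultiple</name>
    <value>false</value>
  </property>

or, keeping the feature, the rewrite's per-heartbeat caps can be used
instead (defaults are -1, i.e. unlimited):

  <!-- cap tasks assigned per heartbeat; names per the 1.x fair scheduler docs -->
  <property>
    <name>mapred.fairscheduler.assignmultiple.maps</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.fairscheduler.assignmultiple.reduces</name>
    <value>1</value>
  </property>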