hadoop-common-user mailing list archives

From Amit Sela <am...@infolinks.com>
Subject Re: Hadoop 1.0.4 Performance Problem
Date Tue, 27 Nov 2012 09:50:40 GMT
Hi Jon,

I recently upgraded our cluster from Hadoop 0.20.3-append to Hadoop 1.0.4
and I haven't noticed any performance issues. By "multiple assignment
feature" do you mean speculative execution
(mapred.reduce.tasks.speculative.execution)?

On Mon, Nov 26, 2012 at 11:49 PM, Jon Allen <jayayedev@gmail.com> wrote:

> Problem solved, but worth warning others about.
> Before the upgrade the reducers for the terasort process had been evenly
> distributed around the cluster - one per task tracker in turn, looping
> around the cluster until all tasks were allocated.  After the upgrade all
> reduce tasks were submitted to a small number of task trackers - tasks were
> assigned to one task tracker until its slots were full, and only then did
> the scheduler move on to the next task tracker.  Skewing the reducers like
> this quite clearly hit the benchmark performance.
> The reason for this turns out to be the fair scheduler rewrite
> (MAPREDUCE-2981) that appears to have subtly modified the behaviour of the
> assign multiple property. Previously this property caused a single map and
> a single reduce task to be allocated in a task tracker heartbeat (rather
> than the default of a map or a reduce).  After the upgrade it allocates as
> many tasks as there are available task slots.  Turning off the multiple
> assignment feature returned the terasort to its pre-upgrade performance.
> I can see potential benefits to this change and need to think through the
> consequences to real world applications (though in practice we're likely to
> move away from fair scheduler due to MAPREDUCE-4451).  Investigating this
> has been a pain, so to warn other users: is there anywhere central that can
> be used to record upgrade gotchas like this?
> On Fri, Nov 23, 2012 at 12:02 PM, Jon Allen <jayayedev@gmail.com> wrote:
>> Hi,
>> We've just upgraded our cluster from Hadoop 0.20.203 to 1.0.4 and have
>> hit performance problems.  Before the upgrade a 15TB terasort took about 45
>> minutes; afterwards it takes just over an hour.  Looking in more detail, it
>> appears the shuffle phase has increased from 20 minutes to 40 minutes.
>> Does anyone have any thoughts about what's changed between these releases
>> that may have caused this?
>> The only change to the system has been to Hadoop.  We moved from a
>> tarball install of 0.20.203 with all processes running as hadoop to an RPM
>> deployment of 1.0.4 with processes running as hdfs and mapred.  Nothing
>> else has changed.
>> As a related question, we're still running with a configuration that was
>> tuned for version 0.20.1. Are there any recommendations for tuning
>> properties that have been introduced in recent versions that are worth
>> investigating?
>> Thanks,
>> Jon
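
The placement difference Jon describes can be illustrated with a small simulation. This is only a sketch and not the fair scheduler's actual code: the function name, tracker count, slot count, and task count are made up for illustration, and heartbeats are assumed to arrive from the task trackers in a fixed round-robin order. The "assign multiple" property is presumably mapred.fairscheduler.assignmultiple, though the thread does not name it.

```python
def assign_reduces(num_trackers, slots_per_tracker, num_tasks, per_heartbeat):
    """Return how many reduce tasks land on each task tracker.

    Hypothetical model: trackers heartbeat in round-robin order and the
    scheduler hands out at most `per_heartbeat` reduce tasks per heartbeat.
    """
    assigned = [0] * num_trackers
    remaining = num_tasks
    while remaining > 0:
        progress = False
        for tracker in range(num_trackers):
            free = slots_per_tracker - assigned[tracker]
            n = min(free, remaining, per_heartbeat)
            if n > 0:
                assigned[tracker] += n
                remaining -= n
                progress = True
        if not progress:
            break  # every slot in the cluster is full
    return assigned


# Pre-upgrade behaviour as described: one reduce per heartbeat.
old_layout = assign_reduces(6, 4, 8, per_heartbeat=1)
# Post-upgrade behaviour as described: fill every free slot per heartbeat.
new_layout = assign_reduces(6, 4, 8, per_heartbeat=4)

print(old_layout)  # [2, 2, 1, 1, 1, 1]
print(new_layout)  # [4, 4, 0, 0, 0, 0]
```

With the same eight reduce tasks, the first layout spreads shuffle traffic over all six trackers, while the second concentrates it on two, which is consistent with the longer shuffle phase reported above.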
