Subject: Re: Hadoop 1.0.4 Performance Problem
From: Amit Sela <amits@infolinks.com>
To: user@hadoop.apache.org
Date: Tue, 27 Nov 2012 12:21:31 +0200

So this is a FairScheduler problem?
We are using the default Hadoop scheduler. Is there a reason to use the Fair Scheduler if most of the time we don't have more than 4 jobs running simultaneously?
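
For context, the scheduler is selected in mapred-site.xml via the
mapred.jobtracker.taskScheduler property; if it is left unset you get the
default FIFO scheduler (org.apache.hadoop.mapred.JobQueueTaskScheduler).
Switching to the Fair Scheduler would presumably look something like this,
just a sketch, and it assumes the contrib fairscheduler jar is on the
JobTracker classpath:

  <!-- Replace the default FIFO scheduler with the Fair Scheduler -->
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>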

On Tue, Nov 27, 2012 at 12:00 PM, Harsh J <harsh@cloudera.com> wrote:
Hi Amit,

He means the mapred.fairscheduler.assignmultiple FairScheduler
property. It is true by default, which works well for most workloads,
if not for benchmark-style workloads. I would not usually trust a
benchmark as a baseline performance measure of everything that comes out
of an upgrade.
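
If you do stay on the Fair Scheduler and want the pre-upgrade behaviour
back, the setting being discussed would go into mapred-site.xml roughly
like this (a sketch; it only has an effect when the Fair Scheduler is the
active scheduler):

  <!-- Assign at most one map and one reduce per heartbeat, as before MAPREDUCE-2981 -->
  <property>
    <name>mapred.fairscheduler.assignmultiple</name>
    <value>false</value>
  </property>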

The other JIRA, MAPREDUCE-4451, has been resolved for 1.2.0.

On Tue, Nov 27, 2012 at 3:20 PM, Amit Sela <amits@infolinks.com> wrote:
> Hi Jon,
>
> I recently upgraded our cluster from Hadoop 0.20.3-append to Hadoop 1.0.4
> and I haven't noticed any performance issues. By "multiple assignment
> feature" do you mean speculative execution
> (mapred.map.tasks.speculative.execution and
> mapred.reduce.tasks.speculative.execution) ?
>
>
> On Mon, Nov 26, 2012 at 11:49 PM, Jon Allen <jayayedev@gmail.com> wrote:
>>
>> Problem solved, but worth warning others about.
>>
>> Before the upgrade the reducers for the terasort process had been evenly
>> distributed around the cluster - one per task tracker in turn, looping
>> around the cluster until all tasks were allocated. After the upgrade all
>> reduce tasks had been submitted to a small number of task trackers - submit
>> tasks until the task tracker slots were full and then move onto the next
>> task tracker. Skewing the reducers like this quite clearly hit the
>> benchmark performance.
>>
>> The reason for this turns out to be the fair scheduler rewrite
>> (MAPREDUCE-2981) that appears to have subtly modified the behaviour of the
>> assign multiple property. Previously this property caused a single map and a
>> single reduce task to be allocated in a task tracker heartbeat (rather than
>> the default of a map or a reduce). After the upgrade it allocates as many
>> tasks as there are available task slots. Turning off the multiple
>> assignment feature returned the terasort to its pre-upgrade performance.
>>
>> I can see potential benefits to this change and need to think through the
>> consequences for real-world applications (though in practice we're likely to
>> move away from the fair scheduler due to MAPREDUCE-4451). Investigating this
>> has been a pain, so to warn other users: is there anywhere central that can be
>> used to record upgrade gotchas like this?
>>
>>
>> On Fri, Nov 23, 2012 at 12:02 PM, Jon Allen <jayayedev@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> We've just upgraded our cluster from Hadoop 0.20.203 to 1.0.4 and have
>>> hit performance problems. Before the upgrade a 15TB terasort took about 45
>>> minutes, afterwards it takes just over an hour. Looking in more detail it
>>> appears the shuffle phase has increased from 20 minutes to 40 minutes. Does
>>> anyone have any thoughts about what's changed between these releases that
>>> may have caused this?
>>>
>>> The only change to the system has been to Hadoop. We moved from a
>>> tarball install of 0.20.203 with all processes running as hadoop to an RPM
>>> deployment of 1.0.4 with processes running as hdfs and mapred. Nothing else
>>> has changed.
>>>
>>> As a related question, we're still running with a configuration that was
>>> tuned for version 0.20.1. Are there any recommendations for tuning
>>> properties that have been introduced in recent versions that are worth
>>> investigating?
>>>
>>> Thanks,
>>> Jon
>>
>>
>



--
Harsh J
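
For reference, the speculative execution properties Amit asked about
earlier in the thread are plain job/site configuration knobs, separate
from the Fair Scheduler's assignmultiple setting; a sketch of how they
appear in mapred-site.xml (both default to true in the 1.x line):

  <!-- Allow speculative (duplicate) attempts of slow map and reduce tasks -->
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>true</value>
  </property>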
