Mailing-List: contact hadoop-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-user@lucene.apache.org
Message-ID: <e1d10fc00607251101h116fb2cdn6692fac03a643c4@mail.gmail.com>
Date: Tue, 25 Jul 2006 11:01:00 -0700
From: "Paul Sutter" <sutter@gmail.com>
To: hadoop-user@lucene.apache.org
Subject: Re: Task type priorities during scheduling ?
In-Reply-To: <44C5D507.9080203@apache.org>
References: 
 <358D735BD7AB45429F2B1C14F38E10F70465C729@DEN-EXM-03.corp.ebay.com>
	 <007501c6ac3c$8dc7bad0$a248480a@ds.corp.yahoo.com>
	 <e1d10fc00607201757k4ea9d879t694467f63d9e8712@mail.gmail.com>
	 <80CCE470-BF9D-4AC6-9B76-F55EE0E3E31B@yahoo-inc.com>
	 <44C4801F.5010903@apache.org>
	 <e1d10fc00607240328p4936d397vea380755e259d2ac@mail.gmail.com>
	 <44C5D507.9080203@apache.org>

First, it matters in the case of concurrent jobs. If you submit a
20-minute job while a 20-hour job is running, it would be nice if the
reducers for the 20-minute job could get a chance to run before the
20-hour job's mappers have all finished. So even without a throughput
improvement, you gain an important capability (although it may require
another minor tweak or two to make possible).

Second, we often have stragglers, where one mapper runs slower
than the others. When this happens, we end up with a largely idle
cluster for as long as an hour. In cases like these, good support for
concurrent jobs _would_ improve throughput.
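
To put rough numbers on the straggler case, here is a small sketch of
the arithmetic. The slot count and task durations are made-up
assumptions for illustration, not measurements, and idle_fraction is a
hypothetical helper, not anything in Hadoop. It assumes a single wave
of map tasks, one task per slot, with no other job able to use the
freed slots.

```python
def idle_fraction(slots, task_minutes):
    """Fraction of slot-time left idle while one wave of map tasks runs.

    Assumes one task per slot and that freed slots stay empty until the
    slowest task finishes (i.e. no concurrent job can use them).
    """
    if len(task_minutes) > slots:
        raise ValueError("this sketch only models a single wave of tasks")
    wall_clock = max(task_minutes)      # the wave ends when the straggler ends
    busy = sum(task_minutes)            # slot-minutes actually spent working
    capacity = slots * wall_clock       # slot-minutes available in that window
    return 1 - busy / capacity


# Hypothetical numbers: 100 slots, 99 mappers take 10 minutes,
# one straggler takes 60 minutes.
tasks = [10] * 99 + [60]
print(f"idle slot-time: {idle_fraction(100, tasks):.1%}")
```

With those assumed numbers most of the slot-time in the window is
wasted, which is the capacity a second job's tasks could be using.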

Paul

On 7/25/06, Doug Cutting <cutting@apache.org> wrote:
> Paul Sutter wrote:
> > it should be possible to have lots of tasks in the shuffle phase
> > (mostly, sitting around waiting for mappers to run), but only have
> > "about" one actual reduce phase running per cpu (or whatever works for
> > each of our apps) that gets enough memory for a sorter, does
> > substantial computation, etc.
>
> Ah, now I see your point, although I don't see how this would improve
> overall throughput.  In most cases, the optimal configuration is for the
> total number of reduce tasks to be roughly the total number of reduces
> that can run at once.  So there is no queue of waiting reduce tasks to
> schedule.
>
> Doug
>
>