From: Chris K Wensel
To: core-user@hadoop.apache.org
Subject: Re: Control over max map/reduce tasks per job
Date: Tue, 3 Feb 2009 11:33:43 -0800
Hey Jonathan,

Are you looking to limit the total number of concurrent mappers/reducers a single job can consume cluster-wide, or to limit the number per node?

That is: you have X mappers/reducers, but can only allow N mappers/reducers to run at a time globally for a given job. Or, you are cool with all X running concurrently globally, but want to guarantee that no node runs more than N tasks from that job? Or both?

Just reconciling the conversation we had last week with this thread.

ckw

On Feb 3, 2009, at 11:16 AM, Jonathan Gray wrote:

> All,
>
> I have a few relatively small clusters (5-20 nodes) and am having trouble
> keeping them loaded with my MR jobs.
>
> The primary issue is that I have different jobs with drastically different
> patterns. I have jobs that read/write to/from HBase or Hadoop with minimal
> logic (network-throughput or IO bound), others that perform crawling
> (network-latency bound), and one huge parsing streaming job (very CPU
> bound; each task eats a core).
>
> I'd like to launch very large numbers of tasks for the network-latency-bound
> jobs, but the large CPU-bound job means I have to keep the max maps allowed
> per node low enough not to starve the DataNode and RegionServer.
>
> I'm an HBase dev but not familiar enough with the Hadoop MR code to know
> what would be involved in implementing this. However, in talking with other
> users, it seems this would be a well-received option.
>
> I wanted to ping the list before filing an issue, because it seems like
> someone may have thought about this in the past.
>
> Thanks.
>
> Jonathan Gray

--
Chris K Wensel
chris@wensel.net
http://www.cascading.org/
http://www.scaleunlimited.com/
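For reference, the per-node "max maps allowed" limits discussed above are the TaskTracker slot settings, configured cluster-wide in hadoop-site.xml. A sketch using the Hadoop 0.19-era property names (the values shown are illustrative); note these caps apply to all jobs on a node, and there was no stock per-job equivalent, which is the gap this thread is about:

```xml
<!-- hadoop-site.xml: per-TaskTracker slot caps (Hadoop 0.19-era names).
     These limits apply to every job scheduled on the node; they cannot
     be set differently per job. Example values only. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>2</value>
  <description>Max map tasks run simultaneously by one TaskTracker.</description>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>2</value>
  <description>Max reduce tasks run simultaneously by one TaskTracker.</description>
</property>
```

Because the caps are node-wide, a CPU-heavy job forces them down for every other job on the cluster, which is exactly the starvation trade-off Jonathan describes.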