From: Doug Balog <doug@conviva.com>
To: core-user@hadoop.apache.org
Subject: Re: Why separate Map/Reduce task limits per node ?
Date: Tue, 28 Oct 2008 18:50:43 -0400
Message-Id: <74FC3A43-A7A9-42A4-A988-30F116880368@conviva.com>
In-Reply-To: <623d9cf40810281449h58f29fa7n7a883ddd9982a0fa@mail.gmail.com>

Thanks Alex. I found a JIRA that relates to my question:
https://issues.apache.org/jira/browse/HADOOP-3420

If I decide to do something about this, I'll follow up with HADOOP-3420.

Thanks,
DougB

On Oct 28, 2008, at 5:49 PM, Alex Loddengaard wrote:

> I understand your question now, Doug; thanks for clarifying. However,
> I don't think I can give you a great answer. I'll give it a shot,
> though:
>
> It does seem like having a single task configuration in theory would
> improve utilization, but it might also make things worse. For example,
> generally speaking, reducers take longer to execute.
> This means that it would be possible for some nodes to only perform
> reduce tasks for a given time period in a setup where each node had a
> dynamic amount of mappers and reducers. If a node was running all
> reducers, then that node would have lots of output data being written
> to it, hence not distributing data evenly. Perhaps one could argue
> that over time data would still be distributed evenly, though.
>
> That's the best I can do, I think. Can others chime in?
>
> Alex
>
> On Tue, Oct 28, 2008 at 1:41 PM, Doug Balog wrote:
>
>> Hi Alex, I'm sorry, I think you misunderstood my question. Let me
>> explain some more.
>>
>> I have a hadoop cluster of dual quad core machines.
>> I'm using hadoop-0.18.1 with Matei's fairscheduler patch
>> https://issues.apache.org/jira/browse/HADOOP-3746 running in FIFO
>> mode. I have about 5 different jobs running in a pipeline. The number
>> of map/reduce tasks per job varies based on the input data.
>> I assign the various jobs different priorities, and Matei's FIFO
>> scheduler does almost exactly what I want. (The default scheduler did
>> a horrible job with our workload, because it prefers map tasks.)
>>
>> I'm trying to tune the tasks per node to fully utilize my cluster;
>> my goal is < 10% idle. I'm pretty sure my jobs are cpu bound. I can
>> control the number of tasks per node by setting
>> mapred.tasktracker.map.tasks.maximum and
>> mapred.tasktracker.reduce.tasks.maximum in hadoop-site.xml.
>>
>> But I don't have a fixed number of maps and reduces that I run, so
>> saying 5+3 tends to leave my nodes more idle than I want. I just want
>> to say run 8 tasks per node; I don't care what the mix between map
>> and reduce tasks per node is.
>>
>> I've been wondering why there are separate task limits for map and
>> reduce. Why not a single generic task limit per node ?
>>
>> The only reason I can think of for having separate map and reduce
>> task limits is the default scheduler. It wants to schedule all map
>> tasks first, so you really need to limit the number of them so that
>> reduces have a chance to run.
>>
>> Thanks for any insight,
>> Doug
>>
>>
>> On Oct 27, 2008, at 6:26 PM, Alex Loddengaard wrote:
>>
>>> In most jobs, map and reduce tasks are significantly different, and
>>> their runtimes vary as well. The number of reducers also determines
>>> how many output files you have. So in the case when you would want
>>> one output file, having a single generic task limit would mean that
>>> you'd also have one mapper. This would be quite a limiting setup.
>>>
>>> Hope this helps.
>>>
>>> Alex
>>>
>>> On Mon, Oct 27, 2008 at 1:31 PM, Doug Balog wrote:
>>>
>>>> Hi,
>>>> I've been wondering why there are separate task limits for map and
>>>> reduce. Why not a single generic task limit per node ?
>>>>
>>>> Thanks for any insight,
>>>>
>>>> Doug
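
[For reference, the two per-node slot limits Doug mentions are set in
hadoop-site.xml. A minimal sketch, using the 5+3 split from the thread
as illustrative values (property names are the Hadoop 0.18-era ones
cited above; actual values should be tuned to the workload):]

```xml
<!-- hadoop-site.xml fragment: per-TaskTracker concurrent task limits.
     The two limits are independent of each other, which is exactly
     what this thread is questioning: there is no single combined
     "8 tasks of any kind" limit. -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>5</value>
  <!-- at most 5 map tasks run concurrently on this node -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>3</value>
  <!-- at most 3 reduce tasks run concurrently on this node -->
</property>
```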