hadoop-common-dev mailing list archives

From Amar Kamat <ama...@yahoo-inc.com>
Subject Re: Jobtracker is out of memory with 100,000 dummy map tasks job
Date Fri, 26 Sep 2008 07:35:05 GMT
Ted Dunning wrote:
> Why are you trying to run 100,000 map tasks?  Also, do you mean that you had 100
> nodes, each with 2GB?  If so, that is much too small a machine to try to run
> 1000 tasks on.  It is much better to run about the same number of tasks per
> machine as you have cores (2-3 in your case).   Then you can easily split
> your input into 100,000 pieces which will run in sequence.  For most
> problems, however, it is better to let the system split your data so that
> you get a few tens of seconds of work per split.  It is inefficient to have
> very short tasks and it is inconvenient to have long-running tasks.
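
For illustration, here is a minimal sketch of how a job's map-task count was typically hinted on a 0.16-era cluster with the old org.apache.hadoop.mapred API; the class name and numbers are assumptions for this example, not anything from the thread:

    // A minimal sketch, assuming the old org.apache.hadoop.mapred API.
    import org.apache.hadoop.mapred.JobConf;

    public class SplitTuningSketch {
        public static JobConf configure() {
            JobConf conf = new JobConf(SplitTuningSketch.class);
            // Per-job hint for the number of map tasks; the InputFormat may
            // still choose its own split count from the input's block layout.
            conf.setNumMapTasks(100000);
            return conf;
        }
    }

The number of maps running concurrently on each node is a separate, TaskTracker-side setting ("mapred.tasktracker.map.tasks.maximum" in hadoop-site.xml); 2-3 for a dual-core box matches the advice above.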
>
> On Thu, Sep 25, 2008 at 11:26 PM, 심탁길 <1004shi@nhncorp.com> wrote:
>
>   
>> Hi all
>>
>> Recently I tried a job with 100,000 dummy map tasks on a 100-node cluster
>> (each node 2GB RAM, dual-core, 64-bit, running Hadoop 0.16.4)
>>     
I assume you are using hadoop-0.16.4. This issue was fixed in 
hadoop-0.17, where the JobTracker was made a bit more efficient at 
handling a large number of fast-finishing maps. See HADOOP-2119 for more 
details.
Amar
>> Each map task does nothing but sleep for one minute
>>
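
For reference, a minimal sketch of such a sleep-only map task, in the spirit of Hadoop's bundled SleepJob example and written against the old org.apache.hadoop.mapred API; the class name and key/value types are assumptions, not the poster's actual code:

    import java.io.IOException;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // Each map task sleeps for one minute and emits no output.
    public class SleepMapper extends MapReduceBase
            implements Mapper<NullWritable, NullWritable, NullWritable, NullWritable> {

        public void map(NullWritable key, NullWritable value,
                        OutputCollector<NullWritable, NullWritable> output,
                        Reporter reporter) throws IOException {
            try {
                Thread.sleep(60 * 1000L);  // do nothing but sleep one minute
            } catch (InterruptedException e) {
                throw new IOException(e.toString());
            }
        }
    }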
>> I found that the Jobtracker (1GB heap) consumes about 650MB of heap memory when
>> the job is 50% done.
>>
>> In the end, the job failed at about 90% progress because the Jobtracker hung
>> (?) due to running out of memory.
>>
>> How do you handle this kind of issue?
>>
>> another related issue:
>>
>> While the above job was running, I clicked the "Pending" link on
>> jobdetails.jsp in the web UI.
>>
>> The Jobtracker then consumed 100% CPU, and that 100% CPU condition lasted a
>> couple of minutes.
>>