hadoop-common-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Out of memory after Map tasks
Date Thu, 25 May 2006 18:25:47 GMT
Vijay Murthi wrote:
> I am trying to understand what happens during the time duration when Map task got finished
and reduce task starts executing. I have 2 machines with 4 process + 4 Gigs on each with NFS
(not dfs) to process 50 Gigs of data. Map taks finish completion successfully. After that
I see the following on the tasktracker log.
> 
> "Exception in thread "Server handler 1 on 50040" java.lang.OutOfMemoryError: Java heap
space"

Are you running the current trunk?  My guess is that you are.  If so, 
then this error is "normal"; things should keep running.

Are you running a 64-bit kernel?  If not, can it really take advantage 
of all 4GB?  In my experience, 32-bit JVMs can't effectively use more 
than around 1.5GB, and a 32-bit kernel can't effectively use all 4GB, 
but I may be wrong on that last count.

> Listed below are the configuration parameters. Am I setting the Java heap
> size too low compared to io.sort.mb or the file buffer size? I thought the
> tasktracker just pushes the job to the child node; is this error caused by
> something like moving data? If so, is there a buffer size I can limit? Also,
> I noticed that the reduce files under each of the mapred local directories
> keep growing even after the tasktracker reports the out-of-memory error.

Sorting does indeed happen in the child process.

A 4MB buffer for file streams seems large to me.

You might increase the io.sort.factor.  With 500MB for sorting and a 
sort factor of 100, each sort stream would get a 5MB buffer, plenty to 
ensure that transfer time dominates seek, since the break-even point is 
around 100kB.  So you could even use a sort factor of 500.  That would 
make sorts a lot faster.
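
For concreteness, here is a minimal sketch of how those two settings could 
be adjusted from job-submission code instead of the site config.  The class 
name and the exact values are illustrative, not from this thread; only the 
property names come from the configuration quoted below.

    import org.apache.hadoop.mapred.JobConf;

    public class TuneSortConfig {
      public static void main(String[] args) {
        JobConf conf = new JobConf();
        // With io.sort.mb=500 and a merge factor of 100, each merge
        // stream still gets roughly a 5MB buffer (500MB / 100).
        conf.set("io.sort.factor", "100");
        // A 4MB stream buffer (io.file.buffer.size=4096000) is large;
        // a value much closer to the default is usually sufficient.
        conf.set("io.file.buffer.size", "65536");
        // ... set input/output paths, mapper/reducer classes, and submit.
      }
    }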

Also why are you setting the task timeout so high?  Do you have mappers 
or reducers that take a long time per entry and are not calling 
Reporter.setStatus() regularly?  That can cause tasks to time out.
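
As a rough illustration of the alternative to a huge timeout, a mapper that 
does a lot of work per record can report its own progress.  This is only a 
sketch against the old org.apache.hadoop.mapred interfaces; the class name, 
the loop, and the status message are made up for the example.

    import java.io.IOException;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    // A mapper that is slow per record but keeps reporting status so the
    // framework does not time it out, instead of raising mapred.task.timeout.
    public class SlowMapper implements Mapper {
      public void configure(JobConf job) {}

      public void map(WritableComparable key, Writable value,
                      OutputCollector output, Reporter reporter)
          throws IOException {
        for (int i = 0; i < 1000; i++) {
          // ... expensive per-record processing step ...
          if (i % 100 == 0) {
            // Tell the tasktracker this task is still alive.
            reporter.setStatus("processed step " + i + " for key " + key);
          }
        }
        output.collect(key, value);
      }

      public void close() throws IOException {}
    }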

Doug

> -------------------------------------------------------------------
>   <name>io.sort.factor</name>
>   <value>10</value>
> 
>   <name>io.sort.mb</name>
>   <value>500</value>
> 
>   <name>io.skip.checksum.errors</name>
>   <value>false</value>
> 
>   <name>io.file.buffer.size</name>
>   <value>4096000</value>
> 
> 
>   <name>mapred.reduce.tasks</name>
>   <value>6</value>
> 
>   <name>mapred.task.timeout</name>
>   <value>100000000000</value>
> 
>   <name>mapred.tasktracker.tasks.maximum</name>
>   <value>3</value>
> 
>   <name>mapred.child.java.opts</name>
>   <value>-Xmx1024m</value>
> 
>   <name>mapred.combine.buffer.size</name>
>   <value>100000</value>
> 
>   <name>mapred.speculative.execution</name>
>   <value>true</value>
> 
>   <name>ipc.client.timeout</name>
>   <value>60000</value>
> 
> ------------------------------------------------------------
> # The maximum amount of heap to use, in MB. Default is 1000.
> export HADOOP_HEAPSIZE=1024
> ------------------------------------------------------------
