hadoop-mapreduce-user mailing list archives

From James Hammerton <james.hammer...@mendeley.com>
Subject Re: Mappers crashing due to running out of heap space during initialisation
Date Wed, 27 Apr 2011 10:02:54 GMT
Hi,

I lowered io.sort.mb from 200 MB to 100 MB, and that allowed my job to get
through the mapping phase. Thanks, Chris.

However, what I don't understand is why the memory got used up in the first
place, when the mapper only buffers the previous input and the maximum
serialised size of the objects it's dealing with is 201k.

This is why I asked about what Hadoop is doing in the area of code where the
exception was occurring - as far as I can tell, my mapper code wasn't even
getting run.
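
A possible reading, offered as an assumption and not confirmed anywhere in
this thread: the top frame of the stack trace is the MapOutputBuffer
constructor, which suggests the sort buffer sized by io.sort.mb is allocated
up front, before any map() call, so the size of the input records would not
matter. The sketch below only does the arithmetic for that reading. The class
and method names are hypothetical, and the 200 MB task heap is an assumed
figure, since the actual heap setting is not stated here.

```java
public class SortBufferEstimate {

    // Hypothetical helper: bytes needed if the whole io.sort.mb buffer
    // is allocated in one go when the map output collector is created.
    static long sortBufferBytes(int ioSortMb) {
        return (long) ioSortMb << 20; // megabytes to bytes
    }

    // True when that buffer alone would already fill (or overflow) the heap,
    // leaving no room for anything else in the task JVM.
    static boolean exceedsHeap(int ioSortMb, long maxHeapBytes) {
        return sortBufferBytes(ioSortMb) >= maxHeapBytes;
    }

    public static void main(String[] args) {
        // Assumed task heap of 200 MB; not taken from the thread.
        long assumedHeapBytes = 200L << 20;

        // The two io.sort.mb values mentioned in this message.
        for (int mb : new int[] {200, 100}) {
            System.out.printf(
                "io.sort.mb=%d needs %d bytes, exceeds assumed heap: %b%n",
                mb, sortBufferBytes(mb), exceedsHeap(mb, assumedHeapBytes));
        }
    }
}
```

Under that assumed heap size, a 200 MB buffer could not fit while a 100 MB
buffer could, which would match the job getting through the mapping phase
after the change.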

Regards,

James

On Tue, Apr 26, 2011 at 8:02 PM, Chris Douglas <cdouglas@apache.org> wrote:

> Lower io.sort.mb or raise the heap size for the task. -C
>
> On Tue, Apr 26, 2011 at 10:55 AM, James Hammerton
> <james.hammerton@mendeley.com> wrote:
> > Hi,
> >
> > I have a job that runs fine with a small data set in pseudo-distributed mode
> > on my desktop workstation but when I run it on our Hadoop cluster it falls
> > over, crashing during the initialisation of some of the mappers. The errors
> > look like this:
> >
> > 2011-04-26 14:34:04,494 FATAL org.apache.hadoop.mapred.TaskTracker: Error running child : java.lang.OutOfMemoryError: Java heap space
> >       at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:743)
> >       at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:487)
> >       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:575)
> >       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> >       at org.apache.hadoop.mapred.Child.main(Child.java:170)
> >
> > The mapper itself buffers only the previous input and the objects are small
> > (max 201K in size, most under 50k), so I don't know why this is happening.
> >
> > What exactly is happening in the area of code referred to in the stack
> > trace?
> >
> > Cheers,
> >
> > James
> >
> > --
> > James Hammerton | Senior Data Mining Engineer
> > www.mendeley.com/profiles/james-hammerton
> >
> > Mendeley Limited | London, UK | www.mendeley.com
> > Registered in England and Wales | Company Number 6419015

-- 
James Hammerton | Senior Data Mining Engineer
www.mendeley.com/profiles/james-hammerton

Mendeley Limited | London, UK | www.mendeley.com
Registered in England and Wales | Company Number 6419015
