hadoop-common-user mailing list archives

From "Devaraj Das" <d...@yahoo-inc.com>
Subject RE: OOM error with large # of map tasks
Date Thu, 01 May 2008 19:58:41 GMT
Hi Lili, sorry that I missed one important detail in my last response -
tasks that complete successfully on tasktrackers are marked as
COMMIT_PENDING by the tasktracker itself. The JobTracker takes those
COMMIT_PENDING tasks, promotes their output (if applicable), and then marks
them as SUCCEEDED. However, the tasktrackers are not notified of this, so the
state of those tasks on the tasktrackers doesn't change, i.e., they remain in
the COMMIT_PENDING state. In short, COMMIT_PENDING at the tasktracker's end
doesn't necessarily mean the job is stuck.

The tasktracker keeps in memory the objects corresponding to the tasks it
runs; those objects are purged only on job completion or failure. This
explains why you see so many tasks in the COMMIT_PENDING state. I believe it
creates one jobconf for every task it launches.

My only concern is the memory consumption of the jobconf objects. Going by
your report (~940 MB of Strings across 572 JobConf instances), that works out
to roughly 1.6 MB per jobconf.

You could try things out with an increased heap size for the
tasktrackers/tasks. The tasktracker's heap size can be increased by changing
the value of HADOOP_HEAPSIZE in hadoop-env.sh, and the tasks' heap size by
tweaking the value of mapred.child.java.opts in the hadoop-site.xml for your
job.
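
For illustration only, a minimal sketch of the two settings (the 1024 MB and
512 MB values below are placeholders, not recommendations; pick numbers that
fit your nodes):

  # hadoop-env.sh: heap (in MB) for the Hadoop daemons, including the
  # tasktracker
  export HADOOP_HEAPSIZE=1024

  <!-- hadoop-site.xml: JVM options passed to each child task -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>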

> -----Original Message-----
> From: Lili Wu [mailto:liliwu@gmail.com] 
> Sent: Thursday, May 01, 2008 4:19 AM
> To: core-user@hadoop.apache.org
> Subject: Re: OOM error with large # of map tasks
> 
> Hi Devaraj,
> 
> We don't have any special configuration on the job conf...
> 
> We only allow 3 map tasks and 3 reduce tasks on *one* node at
> any time, so we are puzzled as to why there are 572 job confs on
> *one* node. From the heap dump, we see there are 569 MapTask
> and 3 ReduceTask objects (and that corresponds to 1138 MapTaskStatus
> and 6 ReduceTaskStatus).
> 
> We *think* many map tasks were stuck in the COMMIT_PENDING stage,
> because in the heap dump we saw a lot of MapTaskStatus objects in
> either the "UNASSIGNED" or "COMMIT_PENDING" state (the runState
> variable in MapTaskStatus). Then we took a look at another node in
> the UI just now; for a given task tracker, under "Non-running
> tasks", there are at least 200 or 300 COMMIT_PENDING tasks. It
> appears they are stuck too.
> 
> Thanks a lot for your help!
> 
> Lili
> 
> 
> On Wed, Apr 30, 2008 at 2:14 PM, Devaraj Das <ddas@yahoo-inc.com> wrote:
> 
> > Hi Lili, the jobconf memory consumption seems quite high. Could you
> > please let us know if you pass anything in the jobconf of jobs that
> > you run? I think you are seeing the 572 objects since a job is
> > running and the TaskInProgress objects for tasks of the running job
> > are kept in memory (but I need to double check this).
> > Regarding COMMIT_PENDING, yes, it means that the tasktracker has
> > finished executing the task but the jobtracker hasn't committed the
> > output yet. In 0.16 all tasks have to necessarily take the
> > transition RUNNING -> COMMIT_PENDING -> SUCCEEDED. This behavior has
> > been improved in 0.17 (HADOOP-3140) to include only tasks that
> > generate output, i.e., a task is marked as SUCCEEDED directly if it
> > doesn't generate any output in its output path.
> >
> > Devaraj
> >
> > > -----Original Message-----
> > > From: Lili Wu [mailto:liliwu@gmail.com]
> > > Sent: Thursday, May 01, 2008 2:09 AM
> > > To: core-user@hadoop.apache.org
> > > Cc: samr@ning.com
> > > Subject: OOM error with large # of map tasks
> > >
> > > We are using hadoop 0.16 and are seeing a consistent problem:
> > >  out of memory errors when we have a large # of map tasks.
> > > The specifics of what is submitted when we reproduce this:
> > >
> > > three large jobs:
> > > 1. 20,000 map tasks and 10 reduce tasks
> > > 2. 17,000 map tasks and 10 reduce tasks
> > > 3. 10,000 map tasks and 10 reduce tasks
> > >
> > > These are at normal priority, and periodically we swap the
> > > priorities around to get some tasks started by each and let them
> > > complete. Other smaller jobs come and go every hour or so (no more
> > > than 200 map tasks, 4-10 reducers).
> > >
> > > Our cluster consists of 23 nodes, and we can run 69 map tasks and
> > > 69 reduce tasks at a time.
> > > Eventually, we see consistent OOM errors in the task logs, and the
> > > task tracker itself goes down on as many as 14 of our nodes.
> > >
> > > We examined a heap dump after one of these crashes of a
> > > TaskTracker and found something interesting: there were 572
> > > instances of JobConf that accounted for 940 MB of String objects.
> > > It seems quite odd that there are so many instances of JobConf. It
> > > seems to correlate with tasks in the COMMIT_PENDING state as shown
> > > on the status page for a task tracker node. Has anyone observed
> > > something like this? Can anyone explain what would cause tasks to
> > > remain in this state (which also apparently is kept in memory
> > > rather than serialized to disk...)? In general, what does
> > > COMMIT_PENDING mean? (Job done, but output not committed to DFS?)
> > >
> > > Thanks!
> > >
> >
> >
> 

