hadoop-common-dev mailing list archives

From "Koji Noguchi (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-2128) Hang JobTracker, running out of memory
Date Tue, 30 Oct 2007 23:02:50 GMT
Hang JobTracker, running out of memory
--------------------------------------

                 Key: HADOOP-2128
                 URL: https://issues.apache.org/jira/browse/HADOOP-2128
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.14.3
            Reporter: Koji Noguchi


This may be expected.

The JobTracker hung with a 1G heap size; top showed 99% CPU.

Ran about 80 jobs, each with 2500 mappers and 200 reducers.  They finish quite fast, 3-4 minutes
on average per job.
(200k tasks in total)


How much memory does the JobTracker use for 'completed' (but not yet expired) jobs?

jmap -heap showed 
{noformat} 
...
PS Old Generation
   capacity = 932118528 (888.9375MB)
   used     = 932118528 (888.9375MB)
...
{noformat} 
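
As a rough sanity check on the numbers above, the following sketch divides the reported old-generation usage by the task count. It assumes the whole old generation is retained task state, which overstates the per-task cost, and it takes "about 80 jobs" literally; the variable names are illustrative only.

```python
# Figures quoted in the report.
JOBS = 80                        # "about 80 jobs" (approximate)
MAPS_PER_JOB = 2500
REDUCES_PER_JOB = 200
OLD_GEN_USED_BYTES = 932118528   # "used" from jmap -heap, PS Old Generation

tasks_per_job = MAPS_PER_JOB + REDUCES_PER_JOB
total_tasks = JOBS * tasks_per_job

# Upper bound: assumes every byte of the old generation is retained task state.
bytes_per_task = OLD_GEN_USED_BYTES / total_tasks

print(f"tasks per job:          {tasks_per_job}")
print(f"total tasks:            {total_tasks}")
print(f"old-gen bytes per task: {bytes_per_task:.0f}")
```

Under those assumptions the heap works out to roughly 4.3 KB per retained task.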

jmap -histo showed 
{noformat} 
num   #instances    #bytes  class name
--------------------------------------
  1:   3974182   355869992  [C
  2:   5216606   125198544  java.lang.String
  3:   2238560   107450880  java.util.TreeMap
  4:    463206   101673488  [B
  5:   1979995    63359840  java.util.TreeMap$Entry
  6:    248400    35769600  org.apache.hadoop.mapred.TaskInProgress
  7:    308803    30898112  [Ljava.lang.Object;
  8:    978240    23477760  org.apache.hadoop.mapred.Counters$CounterRec
  9:    249876    19990080  org.apache.hadoop.mapred.TaskStatus
 10:    248836    19906880  java.net.URI
 11:    230337    16584264  org.apache.hadoop.mapred.MapTask
...
{noformat} 
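
To read the histogram more easily, the rows above can be reduced to bytes per instance and share of the listed total. This is only a sketch over the eleven rows shown (the elided rows are not included), and the job estimate at the end assumes every TaskInProgress belongs to a job with 2500 maps and 200 reduces.

```python
import re

# The eleven rows shown in the jmap -histo excerpt.
HISTO = """\
  1:   3974182   355869992  [C
  2:   5216606   125198544  java.lang.String
  3:   2238560   107450880  java.util.TreeMap
  4:    463206   101673488  [B
  5:   1979995    63359840  java.util.TreeMap$Entry
  6:    248400    35769600  org.apache.hadoop.mapred.TaskInProgress
  7:    308803    30898112  [Ljava.lang.Object;
  8:    978240    23477760  org.apache.hadoop.mapred.Counters$CounterRec
  9:    249876    19990080  org.apache.hadoop.mapred.TaskStatus
 10:    248836    19906880  java.net.URI
 11:    230337    16584264  org.apache.hadoop.mapred.MapTask
"""

# rank: instances bytes class-name
ROW = re.compile(r"^\s*(\d+):\s+(\d+)\s+(\d+)\s+(\S+)\s*$")


def parse_histo(text):
    """Return a list of (class_name, instances, bytes) tuples."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((m.group(4), int(m.group(2)), int(m.group(3))))
    return rows


rows = parse_histo(HISTO)
listed_bytes = sum(nbytes for _, _, nbytes in rows)

for name, instances, nbytes in rows:
    print(f"{name:46s} {nbytes / instances:8.1f} B/instance "
          f"{100 * nbytes / listed_bytes:5.1f}% of listed")

# Assumption: each TaskInProgress belongs to a 2500-map, 200-reduce job.
tip_instances = dict((n, i) for n, i, _ in rows)[
    "org.apache.hadoop.mapred.TaskInProgress"]
implied_jobs = tip_instances / (2500 + 200)
print(f"jobs implied by TaskInProgress count: {implied_jobs:.0f}")
```

The TaskInProgress count corresponds to 92 jobs of that shape, which is in line with "about 80 jobs" plus some margin, so most completed tasks appear to be still held in memory.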

The log showed many 'heartbeat discarded' messages:
{noformat} 
2007-10-30 22:55:46,912 WARN org.apache.hadoop.ipc.Server: IPC Server handler 6 on 58567,
call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus@1afb9c9, false, true, 3942) from
99.99.99.99:9999 discarded for being too old (2578616)
{noformat} 
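
For anyone scanning a JobTracker log for these warnings, a small sketch that pulls the fields out of the message above. The regular expression is written against this one sample line only, and treating the number in parentheses as an age in milliseconds is an assumption, not something stated in the log.

```python
import re

# The sample warning from the log, rejoined onto one line.
LOG_LINE = (
    "2007-10-30 22:55:46,912 WARN org.apache.hadoop.ipc.Server: "
    "IPC Server handler 6 on 58567, "
    "call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus@1afb9c9, "
    "false, true, 3942) from 99.99.99.99:9999 "
    "discarded for being too old (2578616)"
)

# Pattern derived from the single sample above; other layouts may differ.
DISCARD = re.compile(
    r"IPC Server handler (?P<handler>\d+) on (?P<port>\d+), "
    r"call (?P<method>\w+)\(.*\) from (?P<client>\S+) "
    r"discarded for being too old \((?P<age>\d+)\)"
)


def parse_discard(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    m = DISCARD.search(line)
    if m is None:
        return None
    return {
        "handler": int(m.group("handler")),
        "port": int(m.group("port")),
        "method": m.group("method"),
        "client": m.group("client"),
        "age": int(m.group("age")),
    }


record = parse_discard(LOG_LINE)
# Assumption: the age is in milliseconds.
age_minutes = record["age"] / 60000
print(record["method"], record["client"], f"{age_minutes:.1f} min")
```

If the milliseconds assumption holds, this heartbeat waited about 43 minutes before being discarded, which fits a JobTracker that spends nearly all its time in garbage collection.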

Is the solution either to increase the JobTracker heap size or to set a shorter 'mapred.userlog.retain.hours'?




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

