hadoop-common-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David Kellogg <d...@wink.com>
Subject Re: Job tracker not responding during streaming job
Date Thu, 07 May 2009 01:11:13 GMT
I still see the memory leak in the JobTracker, (version 0.19.0,  
streaming, Java version 1.6). Doubling the heap size simply doubled  
the time-to-failure. I ran hprof against the jobtracker process.

It appears the Counters objects are instantiated many times. The  
stack traces often point back to Counters again.

Can anyone guess what the problem is here?

Dave

SITES BEGIN (ordered by live bytes) Wed May  6 12:33:48 2009
           percent          live          alloc'ed  stack class
  rank   self  accum     bytes objs     bytes  objs trace name
     1  3.51%  3.51%  25115888 46858 255055064 475849 332154  
java.lang.Object[]
     2  3.49%  7.00%  24970632 46587  24970632 46587 332642  
java.lang.Object[]
     3  3.35% 10.35%  23959352 379880  43381832 677145 333210 char[]
     4  3.19% 13.53%  22805816 379875  41290952 677145 333204 char[]
     5  2.48% 16.02%  17782896 370477  22768176 474337 333211  
java.util.HashMap$Entry
     6  2.12% 18.14%  15195200 379880  27085800 677145 333209  
java.lang.String
     7  2.12% 20.26%  15195000 379875  27085800 677145 333203  
java.lang.String
     8  2.07% 22.33%  14819080 370477  18973480 474337 333198  
org.apache.hadoop.mapred.Counters$Counter
     9  1.97% 24.30%  14083560 92655  18559200 122100 333168  
java.util.HashMap$Entry[]
    10  1.54% 25.84%  11006400 45860  11006400 45860 331020  
org.apache.hadoop.mapred.TaskInProgress


TRACE 332154:
     java.util.IdentityHashMap.init(IdentityHashMap.java:244)
     java.util.IdentityHashMap.<init>(IdentityHashMap.java:190)
     org.apache.hadoop.mapred.Counters.<init>(Counters.java:389)
     org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:351)
     org.apache.hadoop.mapred.TaskStatus.readTaskStatus 
(TaskStatus.java:393)
     org.apache.hadoop.mapred.TaskTrackerStatus.readFields 
(TaskTrackerStatus.java:297)
     org.apache.hadoop.io.ObjectWritable.readObject 
(ObjectWritable.java:237)
     org.apache.hadoop.ipc.RPC$Invocation.readFields(RPC.java:100)
     org.apache.hadoop.ipc.Server$Connection.processData(Server.java: 
845)
     org.apache.hadoop.ipc.Server$Connection.readAndProcess 
(Server.java:812)

TRACE 332642:
     java.util.IdentityHashMap.init(IdentityHashMap.java:244)
     java.util.IdentityHashMap.<init>(IdentityHashMap.java:190)
     org.apache.hadoop.mapred.Counters.<init>(Counters.java:389)
     org.apache.hadoop.mapred.Task.<init>(Task.java:323)
     org.apache.hadoop.mapred.MapTask.<init>(MapTask.java:91)
     org.apache.hadoop.mapred.TaskInProgress.addRunningTask 
(TaskInProgress.java:857)
     org.apache.hadoop.mapred.TaskInProgress.getTaskToRun 
(TaskInProgress.java:844)
     org.apache.hadoop.mapred.JobInProgress.obtainNewMapTask 
(JobInProgress.java:889)
     org.apache.hadoop.mapred.JobQueueTaskScheduler.assignTasks 
(JobQueueTaskScheduler.java:150)
     org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1887)
TRACE 333210:
     java.util.Arrays.copyOfRange(Arrays.java:3209)
     java.lang.String.<init>(String.java:216)
     java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
     java.nio.CharBuffer.toString(CharBuffer.java:1157)
     org.apache.hadoop.io.Text.decode(Text.java:350)
     org.apache.hadoop.io.Text.decode(Text.java:322)
     org.apache.hadoop.io.Text.readString(Text.java:403)
     org.apache.hadoop.mapred.Counters$Counter.readFields 
(Counters.java:90)
     org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java: 
371)
     org.apache.hadoop.mapred.Counters.readFields(Counters.java:554)

TRACE 333204:
     java.util.Arrays.copyOfRange(Arrays.java:3209)
     java.lang.String.<init>(String.java:216)
     java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
     java.nio.CharBuffer.toString(CharBuffer.java:1157)
     org.apache.hadoop.io.Text.decode(Text.java:350)
     org.apache.hadoop.io.Text.decode(Text.java:322)
     org.apache.hadoop.io.Text.readString(Text.java:403)
     org.apache.hadoop.mapred.Counters$Counter.readFields 
(Counters.java:88)
     org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java: 
371)
     org.apache.hadoop.mapred.Counters.readFields(Counters.java:554)

TRACE 333211:
     java.util.HashMap$Entry.<init>(HashMap.java:683)
     java.util.HashMap.addEntry(HashMap.java:753)
     java.util.HashMap.put(HashMap.java:385)
     org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java: 
372)
     org.apache.hadoop.mapred.Counters.readFields(Counters.java:554)
     org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:355)
     org.apache.hadoop.mapred.TaskStatus.readTaskStatus 
(TaskStatus.java:393)
     org.apache.hadoop.mapred.TaskTrackerStatus.readFields 
(TaskTrackerStatus.java:297)
     org.apache.hadoop.io.ObjectWritable.readObject 
(ObjectWritable.java:237)
     org.apache.hadoop.ipc.RPC$Invocation.readFields(RPC.java:100)

TRACE 333209:
     java.lang.String.<init>(String.java:203)
     java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
     java.nio.CharBuffer.toString(CharBuffer.java:1157)
     org.apache.hadoop.io.Text.decode(Text.java:350)
     org.apache.hadoop.io.Text.decode(Text.java:322)
     org.apache.hadoop.io.Text.readString(Text.java:403)
     org.apache.hadoop.mapred.Counters$Counter.readFields 
(Counters.java:90)
     org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java: 
371)
     org.apache.hadoop.mapred.Counters.readFields(Counters.java:554)
     org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:355)

TRACE 333203:
     java.lang.String.<init>(String.java:203)
     java.nio.HeapCharBuffer.toString(HeapCharBuffer.java:542)
     java.nio.CharBuffer.toString(CharBuffer.java:1157)
     org.apache.hadoop.io.Text.decode(Text.java:350)
     org.apache.hadoop.io.Text.decode(Text.java:322)
     org.apache.hadoop.io.Text.readString(Text.java:403)
     org.apache.hadoop.mapred.Counters$Counter.readFields 
(Counters.java:88)
     org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java: 
371)
     org.apache.hadoop.mapred.Counters.readFields(Counters.java:554)
     org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:355)

TRACE 333198:
     org.apache.hadoop.mapred.Counters$Counter.<init>(Counters.java:74)
     org.apache.hadoop.mapred.Counters$Group.readFields(Counters.java: 
370)
     org.apache.hadoop.mapred.Counters.readFields(Counters.java:554)
     org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:355)
     org.apache.hadoop.mapred.TaskStatus.readTaskStatus 
(TaskStatus.java:393)
     org.apache.hadoop.mapred.TaskTrackerStatus.readFields 
(TaskTrackerStatus.java:297)
     org.apache.hadoop.io.ObjectWritable.readObject 
(ObjectWritable.java:237)
     org.apache.hadoop.ipc.RPC$Invocation.readFields(RPC.java:100)
     org.apache.hadoop.ipc.Server$Connection.processData(Server.java: 
845)
     org.apache.hadoop.ipc.Server$Connection.readAndProcess 
(Server.java:812)

TRACE 333168:
     java.util.HashMap.<init>(HashMap.java:209)
     org.apache.hadoop.mapred.Counters$Group.<init>(Counters.java:195)
     org.apache.hadoop.mapred.Counters.readFields(Counters.java:553)
     org.apache.hadoop.mapred.TaskStatus.readFields(TaskStatus.java:355)
     org.apache.hadoop.mapred.TaskStatus.readTaskStatus 
(TaskStatus.java:393)
     org.apache.hadoop.mapred.TaskTrackerStatus.readFields 
(TaskTrackerStatus.java:297)
     org.apache.hadoop.io.ObjectWritable.readObject 
(ObjectWritable.java:237)
     org.apache.hadoop.ipc.RPC$Invocation.readFields(RPC.java:100)
     org.apache.hadoop.ipc.Server$Connection.processData(Server.java: 
845)
     org.apache.hadoop.ipc.Server$Connection.readAndProcess 
(Server.java:812)

TRACE 331020:
     org.apache.hadoop.mapred.TaskInProgress.<init> 
(TaskInProgress.java:127)
     org.apache.hadoop.mapred.JobInProgress.initTasks 
(JobInProgress.java:389)
     org.apache.hadoop.mapred.EagerTaskInitializationListener 
$JobInitThread.run(EagerTaskInitializationListener.java:55)
     java.lang.Thread.run(Thread.java:619)





On Apr 7, 2009, at 10:50 AM, David Kellogg wrote:

> Thanks.
>
> It is Hadoop 0.19.0. It looks like a peak of 90 concurrent map  
> tasks and 54 concurrent reduce tasks, not necessarily at the same  
> time. You asked how many are held in memory. I think the number is  
> 100, the default.
>
> It looks like 130 MB peak heap size used on the jobtracker. The  
> default 1000 is allocated, currently.
>
> I set mapred.jobtracker.completeuserjobs.maximum=5
> I increased HADOOP_HEAPSIZE to 2000 in hadoop-env.sh.
> I included the "kill -3" dump from the jobtracker, far below.
>
> I'll check back in 3 days, if and when the jobtracker fails again.
>
> Dave
>
>
>
>
> On Apr 6, 2009, at 9:42 PM, Amar Kamat wrote:
>
>> David Kellogg wrote:
>>> I am running Hadoop streaming. After around 42 jobs on an 18-node  
>>> cluster, the jobtracker stops responding. This happens on  
>>> normally-working code. Here are the symptoms.
>>>
>>> 1. A job is running, but it pauses with reduce stuck at XX%
>>> 2. "hadoop job -list" hangs or takes a very long time to return
>>> 3. In the Ganglia metrics on the Jobtracker node:
>>>      a. jvm.metrics__JobTracker__gcTimeMillis rises above 20 k  
>>> (20 seconds) before failure
>>>      b. jvm.metrics__JobTracker__memHeapUsedM rises above 600  
>>> before failure
>>>      c. jvm.metrics__JobTracker__gcCount rises above 1 k before  
>>> failure
>>>
>>>
>>> The ticker looks like this.
>>>
>>> 09/04/06 03:06:28 INFO streaming.StreamJob:  map 24%  reduce 7%
>>> 09/04/06 03:13:44 INFO streaming.StreamJob:  map 25%  reduce 7%
>>> After the 03:13:44 line, it hangs for more than 15 minutes.
>>>
>>> In the jobtracker log, I see this.
>>>
>>> 2009-04-04 04:19:13,563 WARN org.apache.hadoop.hdfs.DFSClient:  
>>> Error Recovery for block blk_-8143535428142072268_95993 failed   
>>> because recovery from primary datanode 10.1.0.156:50010 failed 4  
>>> times. Will retry...
>>>
>>> After restarting both dfs and mapreduce on all nodes, the problem  
>>> goes away, and the formally non-working job proceeds without  
>>> failure.
>> David,
>> What version are you using?
>> There can be because of :
>> 1) Number of tasks in jobtracker's memory might exceed its limits.  
>> What is the total number of tasks in the jobtracker's memory? What  
>> is the jobtracker's heap size? Try increasing the heap size and  
>> also try setting the mapred.jobtracker.completeuserjobs.maximum  
>> parameter to some low value.
>> 2) Sometimes some slow/bad datanode causes the jobtracker to get  
>> stuck. As you have mentioned this might be the cause. Can you let  
>> us know the output of 'kill -3' on jobtracker process.
>>>
>>> Does anyone else see this problem?
>>>
>>> David Kellogg
>>
>
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (10.0-b22 mixed  
> mode):
>
> "LeaseChecker" daemon prio=10 tid=0x00007f0c3042a800 nid=0x3ef3  
> waiting on condition [0x0000000040bf7000..0x0000000040bf7b90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run 
> (DFSClient.java:979)
>         at java.lang.Thread.run(Thread.java:619)
>
> "IPC Server handler 9 on 54311" daemon prio=10  
> tid=0x00007f0c307f0800 nid=0x3dd9 waiting on condition  
> [0x000000004313f000..0x000000004313fd10]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 8 on 54311" daemon prio=10  
> tid=0x00007f0c307ef400 nid=0x3dd8 waiting on condition  
> [0x000000004303e000..0x000000004303ed90]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 7 on 54311" daemon prio=10  
> tid=0x00007f0c307b0400 nid=0x3dd7 waiting on condition  
> [0x0000000042f3d000..0x0000000042f3da10]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 6 on 54311" daemon prio=10  
> tid=0x00007f0c307af000 nid=0x3dd6 waiting on condition  
> [0x0000000042e3c000..0x0000000042e3ca90]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 5 on 54311" daemon prio=10  
> tid=0x00007f0c30296800 nid=0x3dd5 waiting on condition  
> [0x0000000042d3b000..0x0000000042d3bb10]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 4 on 54311" daemon prio=10  
> tid=0x00007f0c302e7c00 nid=0x3dd4 waiting on condition  
> [0x0000000042c3a000..0x0000000042c3ab90]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 3 on 54311" daemon prio=10  
> tid=0x00007f0c302e6800 nid=0x3dd3 waiting on condition  
> [0x0000000042b39000..0x0000000042b39c10]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 2 on 54311" daemon prio=10  
> tid=0x00007f0c30337400 nid=0x3dd2 waiting on condition  
> [0x0000000042a38000..0x0000000042a38c90]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 1 on 54311" daemon prio=10  
> tid=0x00007f0c30336000 nid=0x3dd1 waiting on condition  
> [0x0000000042937000..0x0000000042937d10]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server handler 0 on 54311" daemon prio=10  
> tid=0x00007f0c302a6000 nid=0x3dd0 waiting on condition  
> [0x0000000042836000..0x0000000042836d90]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3ca963a8> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.LinkedBlockingQueue.take 
> (LinkedBlockingQueue.java:358)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:878)
>
> "IPC Server listener on 54311" daemon prio=10  
> tid=0x00007f0c302b6c00 nid=0x3dcf runnable  
> [0x0000000042735000..0x0000000042735a10]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java: 
> 215)
>         at sun.nio.ch.EPollSelectorImpl.doSelect 
> (EPollSelectorImpl.java:65)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect 
> (SelectorImpl.java:69)
>         - locked <0x00007f0c3ca96ac8> (a sun.nio.ch.Util$1)
>         - locked <0x00007f0c3ca96ab0> (a java.util.Collections 
> $UnmodifiableSet)
>         - locked <0x00007f0c3ca96850> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
>         at org.apache.hadoop.ipc.Server$Listener.run(Server.java:297)
>
> "IPC Server Responder" daemon prio=10 tid=0x00007f0c302b5800  
> nid=0x3dce runnable [0x0000000042634000..0x0000000042634a90]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java: 
> 215)
>         at sun.nio.ch.EPollSelectorImpl.doSelect 
> (EPollSelectorImpl.java:65)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect 
> (SelectorImpl.java:69)
>         - locked <0x00007f0c3cba4810> (a sun.nio.ch.Util$1)
>         - locked <0x00007f0c3cba47f8> (a java.util.Collections 
> $UnmodifiableSet)
>         - locked <0x00007f0c3cba4520> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>         at org.apache.hadoop.ipc.Server$Responder.run(Server.java:456)
>
> "expireLaunchingTasks" prio=10 tid=0x00007f0c308ba400 nid=0x3dcd  
> waiting on condition [0x0000000042533000..0x0000000042533b10]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.mapred.JobTracker 
> $ExpireLaunchingTasks.run(JobTracker.java:198)
>         at java.lang.Thread.run(Thread.java:619)
>
> "retireJobs" prio=10 tid=0x00007f0c303a8800 nid=0x3dcc waiting on  
> condition [0x0000000042432000..0x0000000042432b90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.mapred.JobTracker$RetireJobs.run 
> (JobTracker.java:351)
>         at java.lang.Thread.run(Thread.java:619)
>
> "expireTrackers" prio=10 tid=0x00007f0c36940c00 nid=0x3dcb waiting  
> on condition [0x0000000041c7f000..0x0000000041c7fc10]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.mapred.JobTracker$ExpireTrackers.run 
> (JobTracker.java:281)
>         at java.lang.Thread.run(Thread.java:619)
>
> "initJobs" prio=10 tid=0x00007f0c3693fc00 nid=0x3dca in Object.wait 
> () [0x0000000041b7e000..0x0000000041b7ec90]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00007f0c3ca91d90> (a java.util.ArrayList)
>         at java.lang.Object.wait(Object.java:485)
>         at org.apache.hadoop.mapred.EagerTaskInitializationListener 
> $JobInitThread.run(EagerTaskInitializationListener.java:51)
>         - locked <0x00007f0c3ca91d90> (a java.util.ArrayList)
>         at java.lang.Thread.run(Thread.java:619)
>
> "Thread-14" prio=10 tid=0x00007f0c3602f400 nid=0x3dc9 waiting on  
> condition [0x0000000041a7d000..0x0000000041a7dd10]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00007f0c3cf03938> (a  
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park 
> (LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer 
> $ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
>         at java.util.concurrent.DelayQueue.take(DelayQueue.java:160)
>         at java.util.concurrent.DelayQueue.take(DelayQueue.java:39)
>         at org.apache.hadoop.mapred.JobEndNotifier$1.run 
> (JobEndNotifier.java:50)
>         at java.lang.Thread.run(Thread.java:619)
>
> "Timer thread for monitoring mapred" daemon prio=10  
> tid=0x00007f0c36d51000 nid=0x3dc4 in Object.wait()  
> [0x000000004197c000..0x000000004197cb10]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.util.TimerThread.mainLoop(Timer.java:509)
>         - locked <0x00007f0c3ceabad8> (a java.util.TaskQueue)
>         at java.util.TimerThread.run(Timer.java:462)
>
> "Timer thread for monitoring jvm" daemon prio=10  
> tid=0x00007f0c36d86000 nid=0x3dc3 in Object.wait()  
> [0x00000000407b3000..0x00000000407b3b90]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.util.TimerThread.mainLoop(Timer.java:509)
>         - locked <0x00007f0c3ceaeaf8> (a java.util.TaskQueue)
>         at java.util.TimerThread.run(Timer.java:462)
>
> "SocketListener0-1" prio=10 tid=0x00007f0c36a4f000 nid=0x3dc2 in  
> Object.wait() [0x000000004187b000..0x000000004187bc10]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.mortbay.util.ThreadPool$PoolThread.run 
> (ThreadPool.java:522)
>         - locked <0x00007f0c3ceb1760> (a org.mortbay.util.ThreadPool 
> $PoolThread)
>
> "SocketListener0-0" prio=10 tid=0x00007f0c36d69400 nid=0x3dc1 in  
> Object.wait() [0x000000004177a000..0x000000004177ac90]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.mortbay.util.ThreadPool$PoolThread.run 
> (ThreadPool.java:522)
>         - locked <0x00007f0c3ceb1a40> (a org.mortbay.util.ThreadPool 
> $PoolThread)
>
> "Acceptor ServerSocket 
> [addr=0.0.0.0/0.0.0.0,port=0,localport=50030]" prio=10  
> tid=0x00007f0c36b6a800 nid=0x3dc0 runnable  
> [0x0000000041679000..0x0000000041679d10]
>    java.lang.Thread.State: RUNNABLE
>         at java.net.PlainSocketImpl.socketAccept(Native Method)
>         at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
>         - locked <0x00007f0c3cc3b5c8> (a java.net.SocksSocketImpl)
>         at java.net.ServerSocket.implAccept(ServerSocket.java:453)
>         at java.net.ServerSocket.accept(ServerSocket.java:421)
>         at org.mortbay.util.ThreadedServer.acceptSocket 
> (ThreadedServer.java:432)
>         at org.mortbay.util.ThreadedServer$Acceptor.run 
> (ThreadedServer.java:631)
>
> "SessionScavenger" daemon prio=10 tid=0x00007f0c303cc000 nid=0x3dbf  
> waiting on condition [0x0000000040e0b000..0x0000000040e0bd90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.mortbay.jetty.servlet.AbstractSessionManager 
> $SessionScavenger.run(AbstractSessionManager.java:587)
>
> "SessionScavenger" daemon prio=10 tid=0x00007f0c30346800 nid=0x3dbe  
> waiting on condition [0x0000000041de8000..0x0000000041de8a10]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.mortbay.jetty.servlet.AbstractSessionManager 
> $SessionScavenger.run(AbstractSessionManager.java:587)
>
> "SessionScavenger" daemon prio=10 tid=0x00007f0c3031e400 nid=0x3dbd  
> waiting on condition [0x0000000041578000..0x0000000041578a90]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.mortbay.jetty.servlet.AbstractSessionManager 
> $SessionScavenger.run(AbstractSessionManager.java:587)
>
> "Low Memory Detector" daemon prio=10 tid=0x00000000401c9400  
> nid=0x3d7c runnable [0x0000000000000000..0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "CompilerThread1" daemon prio=10 tid=0x00000000401c6000 nid=0x3d7b  
> waiting on condition [0x0000000000000000..0x00000000423302b0]
>    java.lang.Thread.State: RUNNABLE
>
> "CompilerThread0" daemon prio=10 tid=0x00000000401c4000 nid=0x3d7a  
> waiting on condition [0x0000000000000000..0x000000004222f330]
>    java.lang.Thread.State: RUNNABLE
>
> "Signal Dispatcher" daemon prio=10 tid=0x00000000401c2800  
> nid=0x3d79 waiting on condition  
> [0x0000000000000000..0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
> "Finalizer" daemon prio=10 tid=0x000000004019d800 nid=0x3d78 in  
> Object.wait() [0x000000004202e000..0x000000004202eb90]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java: 
> 116)
>         - locked <0x00007f0c3cbfe4e0> (a  
> java.lang.ref.ReferenceQueue$Lock)
>         at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java: 
> 132)
>         at java.lang.ref.Finalizer$FinalizerThread.run 
> (Finalizer.java:159)
>
> "Reference Handler" daemon prio=10 tid=0x0000000040196c00  
> nid=0x3d77 in Object.wait() [0x0000000041f2d000..0x0000000041f2dc10]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:485)
>         at java.lang.ref.Reference$ReferenceHandler.run 
> (Reference.java:116)
>         - locked <0x00007f0c3caf1408> (a java.lang.ref.Reference$Lock)
>
> "main" prio=10 tid=0x0000000040111c00 nid=0x3d73 in Object.wait()  
> [0x0000000041477000..0x0000000041477ed0]
>    java.lang.Thread.State: WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         - waiting on <0x00007f0c3cba4020> (a  
> org.apache.hadoop.ipc.RPC$Server)
>         at java.lang.Object.wait(Object.java:485)
>         at org.apache.hadoop.ipc.Server.join(Server.java:1021)
>         - locked <0x00007f0c3cba4020> (a org.apache.hadoop.ipc.RPC 
> $Server)
>         at org.apache.hadoop.mapred.JobTracker.offerService 
> (JobTracker.java:1309)
>         at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java: 
> 2731)
>
> "VM Thread" prio=10 tid=0x0000000040191c00 nid=0x3d76 runnable
>
> "GC task thread#0 (ParallelGC)" prio=10 tid=0x000000004011c400  
> nid=0x3d74 runnable
>
> "GC task thread#1 (ParallelGC)" prio=10 tid=0x000000004011d800  
> nid=0x3d75 runnable
>
> "VM Periodic Task Thread" prio=10 tid=0x00000000401cb000 nid=0x3d7d  
> waiting on condition
>
> JNI global references: 964
>
> Heap
>  PSYoungGen      total 198528K, used 68487K [0x00007f0c66540000,  
> 0x00007f0c77830000, 0x00007f0c7b290000)
>   eden space 142144K, 48% used  
> [0x00007f0c66540000,0x00007f0c6a821f70,0x00007f0c6f010000)
>   from space 56384K, 0% used  
> [0x00007f0c6f010000,0x00007f0c6f010000,0x00007f0c72720000)
>   to   space 57600K, 0% used  
> [0x00007f0c73ff0000,0x00007f0c73ff0000,0x00007f0c77830000)
>  PSOldGen        total 682688K, used 672276K [0x00007f0c3ca90000,  
> 0x00007f0c66540000, 0x00007f0c66540000)
>   object space 682688K, 98% used  
> [0x00007f0c3ca90000,0x00007f0c65b15068,0x00007f0c66540000)
>  PSPermGen       total 21248K, used 16555K [0x00007f0c37690000,  
> 0x00007f0c38b50000, 0x00007f0c3ca90000)
>   object space 21248K, 77% used  
> [0x00007f0c37690000,0x00007f0c386bae60,0x00007f0c38b50000)
>


Mime
View raw message