hadoop-mapreduce-user mailing list archives

From Vladimir Egorov <vladimir.ego...@oracle.com>
Subject after 2 weeks TaskTracker gets hung with 100% CPU consumption
Date Fri, 20 Apr 2012 18:50:27 GMT
Hi,

After around 2 weeks a TaskTracker (TT) in our MR cluster hangs with 
100% CPU consumption. Most of the time no new tasks are sent to the 
node. We start getting more job failures in the cluster when this 
happens. Once we restart the TT the node is fine for around another two 
weeks.

We also noticed that after a restart some other TT in the cluster starts 
showing the same behavior. This continues until all the TTs have been 
restarted. Another solution is to restart the whole MR cluster.

A thread dump is posted below. It looks like the TT is busy with some log 
cleanup. We also noticed that when we restart, the TT sometimes fails to 
start because the tobedeleted directory cannot be deleted. We have to 
delete it manually, and then the TT starts normally.
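
One way to confirm which thread is consuming the CPU is to match the nid field of each thread in the dump (the native thread ID, in hexadecimal) against per-thread CPU usage from a tool such as top -H, which lists thread IDs in decimal. The following is only a sketch under that assumption (Linux, HotSpot); the function name native_thread_ids and the three sample header lines are illustrative, and the code assumes each thread header sits on a single line.

```python
import re

# Matches a jstack thread header such as:
#   "Thread-6" daemon prio=10 tid=0x... nid=0x2a98 runnable [0x...]
HEADER = re.compile(
    r'^"(?P<name>.+?)".*?\bnid=(?P<nid>0x[0-9a-fA-F]+)\s*(?P<state>[^\[]*)'
)

def native_thread_ids(dump_text):
    """Return (thread name, decimal native thread ID, header state) tuples."""
    rows = []
    for line in dump_text.splitlines():
        m = HEADER.match(line)
        if m:
            # nid is hexadecimal in the dump; top -H shows it in decimal.
            rows.append((m.group("name"), int(m.group("nid"), 16),
                         m.group("state").strip()))
    return rows

# Illustrative sample: three header lines taken from the dump, unwrapped.
sample = '''"Thread-97182" daemon prio=10 tid=0x00002aaab8a7f000 nid=0x1c7d runnable [0x0000000040508000]
"Thread-6" daemon prio=10 tid=0x00002aaab443e800 nid=0x2a98 runnable [0x0000000043047000]
"main" prio=10 tid=0x000000005d04a800 nid=0x2a70 waiting for monitor entry [0x0000000040666000]'''

for name, lwp, state in native_thread_ids(sample):
    print(f"{lwp:>6}  {state:<28} {name}")
```

The decimal IDs printed in the first column can then be compared with the thread IDs that top -H reports for the TaskTracker process.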

Has anyone seen this, and is there a resolution or workaround?

Thank you,
Vladimir

Full thread dump Java HotSpot(TM) 64-Bit Server VM (19.0-b09 mixed mode):

"Thread-97182" daemon prio=10 tid=0x00002aaab8a7f000 nid=0x1c7d runnable [0x0000000040508000]
    java.lang.Thread.State: RUNNABLE
     at java.lang.StringCoding$StringEncoder.encode(StringCoding.java:232)
     at java.lang.StringCoding.encode(StringCoding.java:272)
     at java.lang.String.getBytes(String.java:946)
     at java.io.UnixFileSystem.list(Native Method)
     at java.io.File.list(File.java:973)
     at java.io.File.listFiles(File.java:1051)
     at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:96)
     at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:84)
     at org.apache.hadoop.fs.FileUtil.fullyDeleteContents(FileUtil.java:115)
     at org.apache.hadoop.fs.FileUtil.fullyDelete(FileUtil.java:84)
     at org.apache.hadoop.fs.RawLocalFileSystem.delete(RawLocalFileSystem.java:293)
     at org.apache.hadoop.fs.ChecksumFileSystem.delete(ChecksumFileSystem.java:466)
     at org.apache.hadoop.mapreduce.util.MRAsyncDiskService$DeleteTask.run(MRAsyncDiskService.java:199)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)

"Thread-97171" daemon prio=10 tid=0x00002aaab8a81000 nid=0x1bde waiting for monitor entry [0x000000004030a000]
    java.lang.Thread.State: BLOCKED (on object monitor)
     at org.apache.hadoop.mapred.TaskTracker.getTaskTrackerReportAddress(TaskTracker.java:1351)
     - waiting to lock <0x00000000c185f690> (a org.apache.hadoop.mapred.TaskTracker)
     at org.apache.hadoop.mapred.TaskRunner.getVMArgs(TaskRunner.java:477)
     at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:210)

"Thread-6" daemon prio=10 tid=0x00002aaab443e800 nid=0x2a98 runnable [0x0000000043047000]
    java.lang.Thread.State: RUNNABLE
     at java.lang.String.substring(String.java:1939)
     at java.lang.String.substring(String.java:1904)
     at java.io.File.getName(File.java:401)
     at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:229)
     at java.io.File.exists(File.java:733)
     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:420)
     at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:964)
     at org.apache.hadoop.fs.ChecksumFileSystem.rename(ChecksumFileSystem.java:430)
     at org.apache.hadoop.mapreduce.util.MRAsyncDiskService.moveAndDeleteRelativePath(MRAsyncDiskService.java:244)
     at org.apache.hadoop.mapreduce.util.MRAsyncDiskService.moveAndDeleteAbsolutePath(MRAsyncDiskService.java:361)
     at org.apache.hadoop.mapred.UserLogCleaner.deleteLogPath(UserLogCleaner.java:200)
     at org.apache.hadoop.mapred.UserLogCleaner.processCompletedJobs(UserLogCleaner.java:103)
     - locked <0x00000000c18b0200> (a java.util.Collections$SynchronizedMap)
     at org.apache.hadoop.mapred.UserLogCleaner.run(UserLogCleaner.java:83)

"Directory/File cleanup thread" daemon prio=10 tid=0x00002aaab443c800 nid=0x2a97 waiting on condition [0x0000000042f46000]
    java.lang.Thread.State: WAITING (parking)
     at sun.misc.Unsafe.park(Native Method)
     - parking to wait for <0x00000000c18c9b98> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
     at org.apache.hadoop.mapred.CleanupQueue$PathCleanupThread.run(CleanupQueue.java:130)

"taskCleanup" daemon prio=10 tid=0x00002aaab443c000 nid=0x2a96 waiting for monitor entry [0x0000000042e45000]
    java.lang.Thread.State: BLOCKED (on object monitor)
     at org.apache.hadoop.mapred.TaskTracker.purgeJob(TaskTracker.java:1892)
     - waiting to lock <0x00000000c18afb88> (a java.util.TreeMap)
     - locked <0x00000000c185f690> (a org.apache.hadoop.mapred.TaskTracker)
     at org.apache.hadoop.mapred.TaskTracker$1.run(TaskTracker.java:398)
     at java.lang.Thread.run(Thread.java:662)

"TaskLauncher for REDUCE tasks" daemon prio=10 tid=0x00002aaab4438800 nid=0x2a95 in Object.wait() [0x0000000042c43000]
    java.lang.Thread.State: WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     - waiting on <0x00000000c185f660> (a java.util.LinkedList)
     at java.lang.Object.wait(Object.java:485)
     at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2157)
     - locked <0x00000000c185f660> (a java.util.LinkedList)

"TaskLauncher for MAP tasks" daemon prio=10 tid=0x00002aaab4431800 nid=0x2a94 waiting on condition [0x0000000042d43000]
    java.lang.Thread.State: RUNNABLE
     at java.util.HashMap.newKeyIterator(HashMap.java:840)
     at java.util.HashMap$KeySet.iterator(HashMap.java:874)
     at java.util.HashSet.iterator(HashSet.java:153)
     at java.util.AbstractCollection.containsAll(AbstractCollection.java:276)
     at java.util.AbstractSet.equals(AbstractSet.java:78)
     at java.util.Collections$SynchronizedSet.equals(Collections.java:1655)
     - locked <0x00000000ed5f12c0> (a java.util.Collections$SynchronizedSet)
     at javax.security.auth.Subject.equals(Subject.java:773)
     at org.apache.hadoop.security.UserGroupInformation.equals(UserGroupInformation.java:698)
     at org.apache.hadoop.fs.FileSystem$Cache$Key.isEqual(FileSystem.java:1878)
     at org.apache.hadoop.fs.FileSystem$Cache$Key.equals(FileSystem.java:1888)
     at java.util.HashMap.put(HashMap.java:376)
     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1781)
     - locked <0x00000000c18537b8> (a org.apache.hadoop.fs.FileSystem$Cache)
     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1750)
     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:234)
     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:189)
     at org.apache.hadoop.mapred.TaskTracker$3.run(TaskTracker.java:1006)
     at org.apache.hadoop.mapred.TaskTracker$3.run(TaskTracker.java:1004)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:396)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
     at org.apache.hadoop.mapred.TaskTracker.getFS(TaskTracker.java:1003)
     at org.apache.hadoop.mapred.TaskTracker.localizeJobConfFile(TaskTracker.java:1098)
     at org.apache.hadoop.mapred.TaskTracker.localizeJobFiles(TaskTracker.java:1048)
     at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:977)
     - locked <0x00000000ed5e8d08> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
     at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2247)
     at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2212)

"Map-events fetcher for all reduce tasks on tracker_adc00bzu.us.oracle.com:localhost.localdomain/127.0.0.1:43784" daemon prio=10 tid=0x00002aaab4411800 nid=0x2a8a waiting for monitor entry [0x0000000042b42000]
    java.lang.Thread.State: BLOCKED (on object monitor)
     at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.reducesInShuffle(TaskTracker.java:799)
     - waiting to lock <0x00000000ed5e8d08> (a org.apache.hadoop.mapred.TaskTracker$RunningJob)
     at org.apache.hadoop.mapred.TaskTracker$MapEventsFetcherThread.run(TaskTracker.java:834)
     - locked <0x00000000c18afb88> (a java.util.TreeMap)

"Thread-14" prio=10 tid=0x00002aaab440d000 nid=0x2a88 waiting on condition [0x0000000042940000]
    java.lang.Thread.State: TIMED_WAITING (sleeping)
     at java.lang.Thread.sleep(Native Method)
     at org.apache.hadoop.mapreduce.filecache.TrackerDistributedCacheManager$CleanupThread.run(TrackerDistributedCacheManager.java:892)

"IPC Server handler 3 on 43784" daemon prio=10 tid=0x00002aaab440b000 nid=0x2a87 waiting on condition [0x000000004283f000]
    java.lang.Thread.State: WAITING (parking)
     at sun.misc.Unsafe.park(Native Method)
     - parking to wait for <0x00000000c1874508> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1326)

"IPC Server handler 2 on 43784" daemon prio=10 tid=0x00002aaab4409000 nid=0x2a86 waiting on condition [0x000000004273e000]
    java.lang.Thread.State: WAITING (parking)
     at sun.misc.Unsafe.park(Native Method)
     - parking to wait for <0x00000000c1874508> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1326)

"IPC Server handler 1 on 43784" daemon prio=10 tid=0x00002aaab43eb800 nid=0x2a85 waiting on condition [0x000000004263d000]
    java.lang.Thread.State: WAITING (parking)
     at sun.misc.Unsafe.park(Native Method)
     - parking to wait for <0x00000000c1874508> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1326)

"IPC Server handler 0 on 43784" daemon prio=10 tid=0x00002aaab43ea800 nid=0x2a84 waiting on condition [0x000000004253c000]
    java.lang.Thread.State: WAITING (parking)
     at sun.misc.Unsafe.park(Native Method)
     - parking to wait for <0x00000000c1874508> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1326)

"IPC Server listener on 43784" daemon prio=10 tid=0x00002aaab437c000 nid=0x2a83 runnable [0x000000004243b000]
    java.lang.Thread.State: RUNNABLE
     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
     - locked <0x00000000c18758d8> (a sun.nio.ch.Util$2)
     - locked <0x00000000c18758c8> (a java.util.Collections$UnmodifiableSet)
     - locked <0x00000000c18754a8> (a sun.nio.ch.EPollSelectorImpl)
     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
     at org.apache.hadoop.ipc.Server$Listener.run(Server.java:426)

"IPC Server Responder" daemon prio=10 tid=0x00002aaab42b6000 nid=0x2a82 runnable [0x000000004233a000]
    java.lang.Thread.State: RUNNABLE
     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
     - locked <0x00000000c1876418> (a sun.nio.ch.Util$2)
     - locked <0x00000000c1876408> (a java.util.Collections$UnmodifiableSet)
     - locked <0x00000000c18761f0> (a sun.nio.ch.EPollSelectorImpl)
     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
     at org.apache.hadoop.ipc.Server$Responder.run(Server.java:593)

"pool-3-thread-1" prio=10 tid=0x00002aaab4289000 nid=0x2a81 runnable [0x0000000042239000]
    java.lang.Thread.State: RUNNABLE
     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
     - locked <0x00000000c1875078> (a sun.nio.ch.Util$2)
     - locked <0x00000000c1875068> (a java.util.Collections$UnmodifiableSet)
     - locked <0x00000000c1874e40> (a sun.nio.ch.EPollSelectorImpl)
     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
     at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:321)
     - locked <0x00000000c1875ae8> (a org.apache.hadoop.ipc.Server$Listener$Reader)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)

"pool-2-thread-1" prio=10 tid=0x00002aaab4382000 nid=0x2a80 runnable [0x0000000042138000]
    java.lang.Thread.State: RUNNABLE
     at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
     at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
     at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:760)
     at org.apache.hadoop.io.Text.encode(Text.java:396)
     at org.apache.hadoop.io.Text.set(Text.java:186)
     at org.apache.hadoop.io.Text.<init>(Text.java:89)
     at org.apache.hadoop.mapred.TIETaskTrackerInst$NodeInfoCollector.run(TIETaskTrackerInst.java:88)
     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
     at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)

"pool-1-thread-1" prio=10 tid=0x00002aaab4281800 nid=0x2a7f waiting on condition [0x0000000040fd7000]
    java.lang.Thread.State: TIMED_WAITING (parking)
     at sun.misc.Unsafe.park(Native Method)
     - parking to wait for <0x00000000c1867008> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
     at java.util.concurrent.DelayQueue.take(DelayQueue.java:164)
     at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)
     at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)
     at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
     at java.lang.Thread.run(Thread.java:662)

"Timer-0" daemon prio=10 tid=0x00002aaab42a0000 nid=0x2a7d in Object.wait() [0x0000000040b6c000]
    java.lang.Thread.State: TIMED_WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     - waiting on <0x00000000c187e998> (a java.util.TaskQueue)
     at java.util.TimerThread.mainLoop(Timer.java:509)
     - locked <0x00000000c187e998> (a java.util.TaskQueue)
     at java.util.TimerThread.run(Timer.java:462)

"738807903@qtp0-0 - Acceptor0 SelectChannelConnector@0.0.0.0:50060" prio=10 tid=0x00002aaab4293800 nid=0x2a7c runnable [0x0000000042037000]
    java.lang.Thread.State: RUNNABLE
     at java.util.HashMap.newKeyIterator(HashMap.java:840)
     at java.util.HashMap$KeySet.iterator(HashMap.java:874)
     at java.util.HashSet.iterator(HashSet.java:153)
     at sun.nio.ch.SelectorImpl.processDeregisterQueue(SelectorImpl.java:127)
     - locked <0x00000000c18ae5c8> (a java.util.HashSet)
     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:69)
     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
     - locked <0x00000000c18aea00> (a sun.nio.ch.Util$2)
     - locked <0x00000000c18ae9f0> (a java.util.Collections$UnmodifiableSet)
     - locked <0x00000000c18ae570> (a sun.nio.ch.EPollSelectorImpl)
     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
     at org.mortbay.io.nio.SelectorManager$SelectSet.doSelect(SelectorManager.java:429)
     at org.mortbay.io.nio.SelectorManager.doSelect(SelectorManager.java:185)
     at org.mortbay.jetty.nio.SelectChannelConnector.accept(SelectChannelConnector.java:124)
     at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:707)
     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)

"Low Memory Detector" daemon prio=10 tid=0x000000005d0d8800 nid=0x2a79 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x000000005d0d6800 nid=0x2a78 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x000000005d0d0800 nid=0x2a77 waiting on condition [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x000000005d0ce800 nid=0x2a76 runnable [0x0000000000000000]
    java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x000000005d0aa800 nid=0x2a75 in Object.wait() [0x0000000041a5b000]
    java.lang.Thread.State: WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     - waiting on <0x00000000c18ca210> (a java.lang.ref.ReferenceQueue$Lock)
     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
     - locked <0x00000000c18ca210> (a java.lang.ref.ReferenceQueue$Lock)
     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
     at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x000000005d0a8800 nid=0x2a74 in Object.wait() [0x000000004195a000]
    java.lang.Thread.State: WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     - waiting on <0x00000000c18000b0> (a java.lang.ref.Reference$Lock)
     at java.lang.Object.wait(Object.java:485)
     at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
     - locked <0x00000000c18000b0> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x000000005d04a800 nid=0x2a70 waiting for monitor entry [0x0000000040666000]
    java.lang.Thread.State: BLOCKED (on object monitor)
     at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1533)
     - waiting to lock <0x00000000c185f690> (a org.apache.hadoop.mapred.TaskTracker)
     at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1432)
     at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:2329)
     at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3461)

"VM Thread" prio=10 tid=0x000000005d0a4000 nid=0x2a73 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x000000005d05d800 nid=0x2a71 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x000000005d05f800 nid=0x2a72 runnable

"VM Periodic Task Thread" prio=10 tid=0x000000005d0e3000 nid=0x2a7a waiting on condition

JNI global references: 1519
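
To see which threads block which, the monitor lines in a dump like the one above can be cross-referenced mechanically. The following is a minimal sketch, not part of Hadoop or the JDK: the function name lock_table is illustrative, the sample is trimmed from the dump above, and the code assumes each lock line sits on a single line as jstack prints it.

```python
import re
from collections import defaultdict

HEADER = re.compile(r'^"(.+?)"')
LOCK = re.compile(
    r'^\s*- (locked|waiting to lock|waiting on|parking to wait for)'
    r'\s*<(0x[0-9a-fA-F]+)>'
)

def lock_table(dump_text):
    """Map each monitor address to the threads holding it and blocked on it."""
    holders = defaultdict(list)
    waiters = defaultdict(list)
    thread = None
    released = set()  # monitors the current thread gave up in Object.wait()
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            thread = header.group(1)
            released = set()
            continue
        lock = LOCK.match(line)
        if not lock or thread is None:
            continue
        action, addr = lock.groups()
        if action == "waiting on":
            # A thread in Object.wait() still prints "locked" for the same
            # monitor lower in its stack, but it no longer holds it.
            released.add(addr)
        elif action == "waiting to lock":
            waiters[addr].append(thread)
        elif action == "locked" and addr not in released:
            holders[addr].append(thread)
    return holders, waiters

# Abbreviated sample: headers and lock lines only, trimmed from the dump.
sample = '''
"taskCleanup" daemon prio=10 nid=0x2a96 waiting for monitor entry
     - waiting to lock <0x00000000c18afb88> (a java.util.TreeMap)
     - locked <0x00000000c185f690> (a org.apache.hadoop.mapred.TaskTracker)

"TaskLauncher for REDUCE tasks" daemon prio=10 nid=0x2a95 in Object.wait()
     - waiting on <0x00000000c185f660> (a java.util.LinkedList)
     - locked <0x00000000c185f660> (a java.util.LinkedList)

"main" prio=10 nid=0x2a70 waiting for monitor entry
     - waiting to lock <0x00000000c185f690> (a org.apache.hadoop.mapred.TaskTracker)
'''

holders, waiters = lock_table(sample)
for addr, blocked in waiters.items():
    owner = ", ".join(holders.get(addr, [])) or "(holder not in sample)"
    print(f"{addr}: held by {owner}; blocks {', '.join(blocked)}")
```

Running the same function over the full dump, rather than this trimmed sample, would list every contended monitor together with its holder and the threads waiting for it.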


