hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-2492) ConcurrentModificationException in org.apache.hadoop.ipc.Server.Responder
Date Fri, 28 Dec 2007 05:14:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554653 ]

Devaraj Das commented on HADOOP-2492:
-------------------------------------

There is no stack trace because Responder.run catches the exception and logs only its
string form (_LOG.warn("Exception in Responder " + e)_), so the stack trace is never printed.
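To illustrate why the trace is lost, here is a minimal, self-contained sketch. It is not the Hadoop source: the class and method names (ResponderLoggingSketch, messageOnly, messageWithTrace) are hypothetical, and plain strings stand in for the logger. Concatenating the exception into the message keeps only e.toString(), whereas handing the Throwable itself to the logger (for example via Commons Logging's warn(Object, Throwable), assuming LOG is a Commons Logging Log) would let the stack trace be printed.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ConcurrentModificationException;

public class ResponderLoggingSketch {
    // Mirrors the pattern quoted above: string concatenation calls e.toString(),
    // so only the exception class name (and message, if any) survives.
    static String messageOnly(Throwable e) {
        return "Exception in Responder " + e;
    }

    // Approximates what a logger can emit when it receives the Throwable
    // itself: the message followed by the full stack trace.
    static String messageWithTrace(Throwable e) {
        StringWriter sw = new StringWriter();
        e.printStackTrace(new PrintWriter(sw, true));
        return "Exception in Responder\n" + sw;
    }

    public static void main(String[] args) {
        try {
            throw new ConcurrentModificationException();
        } catch (ConcurrentModificationException e) {
            System.out.println(messageOnly(e));      // one line, no frames
            System.out.println(messageWithTrace(e)); // includes "at ..." frames
        }
    }
}
```

The first form produces the same kind of single-line entry as the JobTracker log quoted below, which would explain why the log gives no hint of where the exception was thrown.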

> ConcurrentModificationException in org.apache.hadoop.ipc.Server.Responder
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-2492
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2492
>             Project: Hadoop
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.16.0
>            Reporter: Devaraj Das
>            Assignee: dhruba borthakur
>             Fix For: 0.16.0
>
>
> I was running Hadoop on 800 machines. After a couple of jobs had run, and 100% of the maps
> of the current job had completed, the JobTracker stopped responding and *all* tasktrackers
> were lost. When I looked at the JT logs, these entries seemed alarming:
> 2007-12-26 19:18:30,185 WARN org.apache.hadoop.ipc.Server: Exception in Responder java.util.ConcurrentModificationException
> Following the above exception, I saw a whole lot of exceptions like:
> 2007-12-26 19:23:10,926 WARN org.apache.hadoop.ipc.Server: Call queue overflow discarding oldest call heartbeat(org.apache.hadoop.mapred.TaskTrackerStatus@5a05f9, false, true, 1758) from 1.2.3.4:1234
> Judging by the number of call queue overflow warnings, it seemed that the JobTracker
> was not processing RPCs after it got the ConcurrentModificationException, and around that
> time the tasktrackers started getting timeouts on RPCs.
> There were two occurrences of the ConcurrentModificationException, but the first
> did not seem to have any effect on the call queue.
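
For background, the report does not identify what triggered the exception, so the following is only a generic sketch of how a ConcurrentModificationException arises in Java, not a reconstruction of the Responder bug. The class name, method names, and sample call strings are all hypothetical. A fail-fast iterator throws when the underlying collection is structurally modified by anything other than the iterator itself, whether from another thread or from the same one.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class CmeSketch {
    // Hypothetical stand-in for a queue of pending calls.
    static List<String> sampleCalls() {
        List<String> calls = new ArrayList<>();
        calls.add("heartbeat-1");
        calls.add("getTask");
        calls.add("heartbeat-2");
        return calls;
    }

    // Removes entries through the list while a for-each loop iterates over it.
    // Returns true if the iterator detected the modification.
    static boolean removeDuringForEach(List<String> calls) {
        try {
            for (String call : calls) {
                if (call.startsWith("heartbeat")) {
                    calls.remove(call); // structural change the iterator does not know about
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safe variant: removal goes through the iterator, which keeps its
    // bookkeeping consistent with the list.
    static void removeViaIterator(List<String> calls) {
        for (Iterator<String> it = calls.iterator(); it.hasNext();) {
            if (it.next().startsWith("heartbeat")) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEach(sampleCalls())); // prints "true"
        List<String> calls = sampleCalls();
        removeViaIterator(calls);
        System.out.println(calls); // prints "[getTask]"
    }
}
```

When two threads share a collection, the same failure can appear intermittently, which would be consistent with the exception showing up only twice in the log, though that remains an assumption here.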

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

