hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1312) heartbeat monitor thread goes away
Date Wed, 02 May 2007 19:37:15 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12493184 ]

Raghu Angadi commented on HADOOP-1312:
--------------------------------------

Another minor change:

Since this patch catches all exceptions inside a couple of threads (just like other threads),
could we log those exceptions at error level instead of info? That way we can distinguish
these unexpected exceptions from the expected ones when grepping the logs.
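
To make the suggestion concrete, here is a minimal sketch of a catch-all handler that logs at the highest severity. It is not taken from the patch: the class and method names are hypothetical, and it uses java.util.logging (where SEVERE plays the role of "error") only to stay self-contained, rather than whatever logging API the patched code actually uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not the code from heartbeatmonitor.patch.
class ErrorLevelLoggingSketch {
    static final Logger LOG = Logger.getLogger("ErrorLevelLoggingSketch");

    /** Runs one pass of a monitor task; returns true if it completed normally. */
    static boolean runOnce(Runnable task) {
        try {
            task.run();
            return true;
        } catch (Exception e) {
            // Unexpected failure: log at SEVERE (the "error" level here) so that
            // grepping for that level separates it from routine INFO messages.
            LOG.log(Level.SEVERE, "Unexpected exception in monitor thread", e);
            return false;
        }
    }

    public static void main(String[] args) {
        LOG.info("Routine status message");
        runOnce(() -> { throw new IllegalStateException("simulated failure"); });
    }
}
```

With this shape, a grep for the error-level marker returns only the unexpected exceptions, while expected conditions stay at info.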


> heartbeat monitor thread goes away
> ----------------------------------
>
>                 Key: HADOOP-1312
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1312
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>            Priority: Blocker
>         Attachments: heartbeatmonitor.patch, heartbeatmonitor2.patch
>
>
> The heartbeat monitor thread encounters a ConcurrentModificationException while iterating
> over the "heartbeats" data structure. This occurred while the namenode was being restarted.
> There are actually two bugs here:
> 1. The Heartbeat Monitor thread needs to catch Exceptions and continue, instead of exiting.
> 2. The heartbeats data structure is protected by the heartbeats lock. The registerDatanode()
> method invokes removeDatanode() without acquiring the heartbeats monitor lock. This causes
> the ConcurrentModificationException.
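
The two fixes described above can be sketched roughly as follows. This is an illustrative sketch only, not the attached patch: the class, the String-based node list, and the runMonitor() helper are assumptions made for the example. Only the names registerDatanode(), removeDatanode(), and "heartbeats" come from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the code from heartbeatmonitor.patch.
class HeartbeatSketch {
    // Stand-in for the "heartbeats" structure; the list object doubles as its lock.
    private final List<String> heartbeats = new ArrayList<>();

    void registerDatanode(String node) {
        synchronized (heartbeats) {      // fix 2: hold the lock before touching the list
            removeDatanode(node);        // drop any stale entry (the lock is reentrant)
            heartbeats.add(node);
        }
    }

    void removeDatanode(String node) {
        synchronized (heartbeats) {
            heartbeats.remove(node);
        }
    }

    /** One monitor pass: iterates under the lock and returns the number of entries seen. */
    int heartbeatCheck() {
        synchronized (heartbeats) {
            int seen = 0;
            for (String node : heartbeats) {
                seen++;
            }
            return seen;
        }
    }

    /** Runs the given number of passes; returns how many passes threw an exception. */
    int runMonitor(int passes) {
        int failures = 0;
        for (int i = 0; i < passes; i++) {
            try {
                heartbeatCheck();
            } catch (Exception e) {      // fix 1: catch, report, and keep the loop alive
                failures++;
                System.err.println("Heartbeat check failed, continuing: " + e);
            }
        }
        return failures;
    }

    public static void main(String[] args) throws InterruptedException {
        HeartbeatSketch s = new HeartbeatSketch();
        // Re-register nodes from another thread while the monitor iterates.
        Thread registrar = new Thread(() -> {
            for (int i = 0; i < 10000; i++) {
                s.registerDatanode("dn" + (i % 10));
            }
        });
        registrar.start();
        int failures = s.runMonitor(10000);
        registrar.join();
        System.out.println("failed passes: " + failures);
    }
}
```

Because every reader and writer synchronizes on the same object, the iteration in heartbeatCheck() cannot overlap a removal, and the try/catch in the loop keeps the monitor running even if some other unexpected exception is thrown.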

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

