hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1540) Make Datanode handle errors to namenode.register call more elegantly
Date Tue, 28 Dec 2010 21:23:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12975630#action_12975630 ]

Konstantin Shvachko commented on HDFS-1540:

From the stack trace I understand you restarted the name-node, which caused the DNs that
were registering at the time to fail.
May I propose a variant, which is easier than commenting on every line:
} catch(RemoteException re) {
  IOException ue = re.unwrapRemoteException(
  if(ue != re) throw ue;
} catch(IOException e) {  // namenode cannot be contacted
  LOG.info("Problem connecting to server: " + getNameNodeAddr(), e);
You need to {{throw}} rather than {{break}} in order to avoid the registration logic after
the loop.
It might be useful to write a test case for that.
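
To illustrate the control flow suggested above, here is a minimal self-contained sketch. It is not the DataNode code: {{RemoteFailure}}, {{RejectedNodeException}}, {{Registrar}} and {{registerWithRetry}} are hypothetical stand-ins, with {{RemoteFailure.unwrap}} assumed to behave like {{RemoteException.unwrapRemoteException}} (return a local exception for a listed type, otherwise return the same object). A recognized rejection is rethrown so the caller never reaches the post-loop registration logic, while connection problems are retried.

```java
import java.io.IOException;

// Hypothetical fatal rejection sent by the server (not a Hadoop class).
class RejectedNodeException extends IOException {
    RejectedNodeException(String msg) { super(msg); }
}

// Hypothetical stand-in for a remote exception that only carries the
// server-side exception class name and message.
class RemoteFailure extends IOException {
    private final String remoteClassName;

    RemoteFailure(String remoteClassName, String msg) {
        super(msg);
        this.remoteClassName = remoteClassName;
    }

    // Returns a local exception if the remote class is one of lookupTypes,
    // otherwise returns this. The sketch hard-codes the single known type.
    IOException unwrap(Class<?>... lookupTypes) {
        for (Class<?> type : lookupTypes) {
            if (type == RejectedNodeException.class
                    && type.getName().equals(remoteClassName)) {
                return new RejectedNodeException(getMessage());
            }
        }
        return this;
    }
}

// Hypothetical view of the register call.
interface Registrar {
    String register() throws IOException;
}

class RegistrationSketch {
    // Retries until registration succeeds, a fatal rejection arrives,
    // or maxAttempts is exhausted.
    static String registerWithRetry(Registrar nn, int maxAttempts) throws IOException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return nn.register();
            } catch (RemoteFailure re) {
                IOException ue = re.unwrap(RejectedNodeException.class);
                if (ue != re) throw ue; // recognized rejection: throw, do not break
                // unrecognized remote error: fall through and retry
            } catch (IOException e) {
                // server cannot be contacted (e.g. connection reset): retry
                System.out.println("Problem connecting to server: " + e.getMessage());
            }
        }
        throw new IOException("Registration failed after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws IOException {
        int[] calls = {0};
        String id = registerWithRetry(() -> {
            if (calls[0]++ < 2) throw new IOException("Connection reset by peer");
            return "dn-1";
        }, 5);
        System.out.println("Registered as " + id + " after " + calls[0] + " calls");
    }
}
```

A test case along these lines could feed the loop a registrar that fails a fixed number of times with a plain {{IOException}} and check that registration still succeeds, then feed it a recognized rejection and check that the exception propagates after a single call.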

> Make Datanode handle errors to namenode.register call more elegantly
> --------------------------------------------------------------------
>                 Key: HDFS-1540
>                 URL: https://issues.apache.org/jira/browse/HDFS-1540
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: datanodeException1.txt, datanodeException2.txt
> When a datanode receives a "Connection reset by peer" from the namenode.register(), it
> exits. This causes many datanodes to die.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
