hadoop-hdfs-issues mailing list archives

From "Brandon Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5583) Make DN send an OOB Ack on shutdown before restarting
Date Wed, 19 Feb 2014 18:22:22 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905797#comment-13905797 ]

Brandon Li commented on HDFS-5583:
----------------------------------

Some early comments. I haven't finished reviewing all the changes.
- DataNode#shutdownDatanode() can be called only once and throws an exception on subsequent
invocations.
After an administrator issues the "dfsadmin shutdownDatanode -upgrade" command, I would imagine
he/she would like to know whether the DataNodes received it and whether they are in the upgrade
preparation state. Unless I missed something, the only way to find out is to issue the same
command again and expect an exception. Would it be better to either let shutdownDatanode
return an error code or have getDataNodeInfo include the current datanode state?

- Do we plan to add more OOB Acks anytime soon? We can always add new enum values later instead
of reserving a few OOB_RESERVEDx values now.

- In DataNode.java: would "forUpgrade", "upgrade" or "shutdownForUpgrade" be a better variable
name than "restarting"? :-)

- DataXceiverServer.java: please remove the unused import.
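
To illustrate the first point, here is a minimal sketch of a shutdown call that reports its
outcome instead of throwing on repeated invocations, together with a queryable state. It is
not taken from the patch: the ShutdownState enum, the class name, the boolean return value
and getShutdownState() are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical states; not part of the HDFS-5583 patch.
enum ShutdownState { RUNNING, SHUTDOWN_FOR_UPGRADE, SHUTDOWN }

// Hypothetical stand-in for the relevant part of DataNode.
class DataNodeShutdownSketch {
    private final AtomicReference<ShutdownState> state =
        new AtomicReference<>(ShutdownState.RUNNING);

    /**
     * Requests a shutdown. Returns true if this call initiated it and
     * false if a shutdown was already in progress, rather than throwing.
     */
    boolean shutdownDatanode(boolean forUpgrade) {
        ShutdownState target = forUpgrade
            ? ShutdownState.SHUTDOWN_FOR_UPGRADE
            : ShutdownState.SHUTDOWN;
        // Only the first caller moves the node out of RUNNING.
        return state.compareAndSet(ShutdownState.RUNNING, target);
    }

    /** Lets an administrator check the state without re-issuing the command. */
    ShutdownState getShutdownState() {
        return state.get();
    }
}
```

With something along these lines, a second "shutdownDatanode" request would simply return
false, and the state could be exposed through whatever call reports datanode info.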


> Make DN send an OOB Ack on shutdown before restarting
> -----------------------------------------------------
>
>                 Key: HDFS-5583
>                 URL: https://issues.apache.org/jira/browse/HDFS-5583
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-5583.patch, HDFS-5583.patch, HDFS-5583.patch
>
>
> Add the ability for data nodes to send an OOB response to indicate an upcoming
> upgrade-restart. The client should ignore the pipeline error from the node for a configured
> amount of time and try to reconstruct the pipeline without excluding the restarted node. If
> the node does not come back in time, regular pipeline recovery should happen.
> This feature is useful for applications that need to keep blocks local. If the
> upgrade-restart is fast, the wait is preferable to losing locality. It could also be used
> in general instead of the draining-writer strategy.
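
The client-side behaviour described in the issue can be sketched as a simple timeout check.
This is only an illustration under stated assumptions, not the patch's implementation: the
class name, method names and the millisecond timeout parameter are hypothetical.

```java
// Hypothetical helper tracking whether a client should keep waiting for a
// datanode that announced an upgrade-restart via an OOB ack.
final class RestartWaitSketch {
    private final long restartTimeoutMs;  // assumed to come from configuration
    private long restartAckAtMs = -1;     // -1 means no restart OOB ack seen

    RestartWaitSketch(long restartTimeoutMs) {
        this.restartTimeoutMs = restartTimeoutMs;
    }

    /** Records the time at which the restart OOB ack was received. */
    void onRestartOobAck(long nowMs) {
        restartAckAtMs = nowMs;
    }

    /**
     * True while the client should ignore pipeline errors from the node and
     * retry with the node still in the pipeline. Once this returns false,
     * regular pipeline recovery (excluding the node) applies.
     */
    boolean shouldWaitForNode(long nowMs) {
        return restartAckAtMs >= 0
            && nowMs - restartAckAtMs < restartTimeoutMs;
    }
}
```

A pipeline error without a preceding restart ack never triggers the wait, so ordinary
failures still go straight to regular recovery.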



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
