hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1362) Provide volume management functionality for DataNode
Date Thu, 11 Nov 2010 21:36:16 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931209#action_12931209 ]

Todd Lipcon commented on HDFS-1362:

We had a brief meeting this morning to discuss this JIRA. To summarize for the community:

- Having the ability to add/remove volumes via RPC has the issue that the changes are not
reflected in the config file, so we risk that an admin may add a volume but forget to update
the config. The next time the cluster is restarted, the volume will be missing and cause problems.
- We discussed that the primary use case for this feature is restoring a volume after it has
failed. The other use case (adding a new volume to a DN that has not suffered any issues)
is rather rare.
- So, rather than providing add/list/remove APIs, we decided to simply add a "refresh" API.
There were two options suggested here:
1. Make use of the new HADOOP-7001 interface for reconfiguring daemons. In this case an admin
could modify the config file to add new volumes, and then refresh the config to have the DN
pick up new volumes or re-add failed volumes. The potential issue is that we want the "refresh"
to do something even when the configuration has not changed, so maybe this is not the right
place for it.
2. Add a new RPC and command line tool, something like "dfsadmin -restoreDNStorage <datanode
IP:port>". This would not re-read the conf file, but rather just re-check any failed volumes
to see if they are newly available (a rough sketch of that check follows below). This could
alternatively be triggered by a new DN servlet or something if it's simpler.
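
As a concrete illustration of option 2, here is a minimal sketch of the DataNode-side check such a
"restore" call might trigger: re-examine volumes that were previously dropped after a disk error and
hand back any that are usable again. The class and method names (FailedVolumeTracker,
restoreFailedVolumes) are hypothetical; the real wiring into FSDataset and the DataNode RPC server
would look different.

import java.io.File;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper: tracks volumes dropped after disk errors. */
public class FailedVolumeTracker {
  private final List<File> failedVolumes = new ArrayList<File>();

  /** Called when a volume is taken out of service after an I/O error. */
  public synchronized void recordFailure(File volumeRoot) {
    failedVolumes.add(volumeRoot);
  }

  /**
   * Re-check every failed volume; a volume is restorable if its root
   * directory exists again and is readable/writable (e.g. the admin
   * replaced the disk and remounted it). Returns the volumes that came
   * back so the caller can re-register them with the block store.
   */
  public synchronized List<File> restoreFailedVolumes() {
    List<File> restored = new ArrayList<File>();
    Iterator<File> it = failedVolumes.iterator();
    while (it.hasNext()) {
      File root = it.next();
      if (root.isDirectory() && root.canRead() && root.canWrite()) {
        restored.add(root);
        it.remove();
      }
    }
    return restored;
  }
}

The dfsadmin command would then just issue an RPC that invokes something like restoreFailedVolumes()
on the target DN and re-adds whatever directories come back.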

- We also discussed pluggability (HDFS-1405). Tom and I were of the opinion that this feature
is generally useful and saw no compelling reason to make it a plugin. We should just improve
FSDataset directly rather than extending it in a new Java class.
- Regarding the new feature of copying blocks from volume to volume in the case that one volume
has gone read-only, we decided that we should defer this to a separate JIRA to be implemented
after this is complete. That will make this one smaller and easier to review.

> Provide volume management functionality for DataNode
> ----------------------------------------------------
>                 Key: HDFS-1362
>                 URL: https://issues.apache.org/jira/browse/HDFS-1362
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node
>            Reporter: Wang Xu
>            Assignee: Wang Xu
>         Attachments: HDFS-1362.txt, Provide_volume_management_for_DN_v1.pdf
> The current management unit in Hadoop is a node, i.e. if a node fails, it will be kicked out and all the data on the node will be re-replicated.
> As almost all SATA controllers support hotplug, we add a new command line interface to the datanode so that it can list, add or remove a volume online, which means we can change a disk without decommissioning the node. Moreover, if the failed disk is still readable and the node has enough space, it can migrate the data on that disk to other disks in the same node.
> A more detailed design document will be attached.
> The original version in our lab is implemented directly against the 0.20 datanode; would it be better to implement it in contrib? Or any other suggestions?
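
For reference, here is a rough sketch of the kind of per-DataNode volume management interface the
description above proposes (list/add/remove a volume while the node stays up). The protocol name and
method signatures below are purely illustrative, not an existing HDFS API:

import java.io.IOException;
import java.util.List;

/** Hypothetical online volume management operations on a single DataNode. */
public interface DataNodeVolumeManagementProtocol {
  /** List the storage directories currently in service on this DataNode. */
  List<String> listVolumes() throws IOException;

  /** Bring a newly attached (hot-plugged) directory into service. */
  void addVolume(String volumeRoot) throws IOException;

  /**
   * Take a directory out of service; its blocks would be re-replicated
   * from other nodes, or migrated to the remaining local volumes if the
   * disk is still readable and there is enough space.
   */
  void removeVolume(String volumeRoot) throws IOException;
}

Per the discussion above, this add/list/remove surface would likely collapse into a single
refresh/restore operation instead.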

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
