hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1850) DN should transmit absolute failed volume count rather than increments to the NN
Date Thu, 28 Apr 2011 02:26:03 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026074#comment-13026074 ]

Todd Lipcon commented on HDFS-1850:
-----------------------------------

Some small comments:
- {{getNumFailedVols}}: probably best not to abbreviate "volumes", since most of the time we
use the full word: {{getNumFailedVolumes}}
- In {{errorReport(...)}} we seem to log twice when there is a disk error. Maybe the first
LOG.info should be moved inside the if statement, and "msg" should be included in the
warnings?
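
One way to read the second suggestion, as a rough sketch only: every report produces exactly one log line, and the disk-error warnings carry "msg". The class name, the error-code constants, and the list that stands in for LOG.info / LOG.warn are all hypothetical; this is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the suggested logging shape; not the actual NameNode code. */
class ErrorReportSketch {
    // Assumed error codes, for this sketch only.
    static final int DISK_ERROR = 1;
    static final int FATAL_DISK_ERROR = 2;

    // Captured log lines, standing in for LOG.info / LOG.warn.
    final List<String> log = new ArrayList<>();

    void errorReport(String dnName, int errorCode, String msg) {
        if (errorCode == DISK_ERROR) {
            // Single warning that includes the DN-supplied message.
            log.add("WARN Disk error on " + dnName + ": " + msg);
        } else if (errorCode == FATAL_DISK_ERROR) {
            log.add("WARN Fatal disk error on " + dnName + ": " + msg);
        } else {
            // The informational line now lives inside the branch, so it is
            // no longer emitted in addition to a disk-error warning.
            log.add("INFO Error report from " + dnName + ": " + msg);
        }
    }
}
```

With this shape, a disk-error report yields one warning containing the message rather than an info line followed by a warning.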

Also, this should be marked Patch Available so that it runs through Hudson.

> DN should transmit absolute failed volume count rather than increments to the NN
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-1850
>                 URL: https://issues.apache.org/jira/browse/HDFS-1850
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, name-node
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>             Fix For: 0.22.0, 0.23.0
>
>         Attachments: hdfs-1850-1.patch, hdfs-1850-2.patch
>
>
> The API added in HDFS-811 for the DN to report volume failures to the NN is "inc(DN)".
> However, the following sequence of events will result in the NN forgetting about reported
> failed volumes:
> # DN loses a volume and reports it
> # NN restarts
> # DN re-registers to the new NN
> A more robust interface would be to have the DN report the total number of volume failures
> to the NN on each heartbeat (the same way other volume state is transmitted).
> This will likely be an incompatible change since it requires changing the Datanode protocol.
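
The difference between the two reporting styles in the description above can be illustrated with a minimal sketch. It assumes the NN keeps failed-volume counts only in memory, so a restart loses them. SketchNameNode and its methods are hypothetical stand-ins, not the real Datanode protocol or HDFS classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the NN's per-DN bookkeeping; not the real HDFS classes. */
class SketchNameNode {
    private final Map<String, Integer> failedVolumes = new HashMap<>();

    /** Incremental style ("inc(DN)"): each call adds one failure to what the NN remembers. */
    void incrementFailedVolumes(String dnId) {
        failedVolumes.merge(dnId, 1, Integer::sum);
    }

    /** Absolute style: each heartbeat overwrites the NN's view with the DN's own total. */
    void heartbeat(String dnId, int totalFailedVolumes) {
        failedVolumes.put(dnId, totalFailedVolumes);
    }

    int getFailedVolumes(String dnId) {
        return failedVolumes.getOrDefault(dnId, 0);
    }
}

class FailedVolumeReportSketch {
    /** Replays the three steps with incremental reporting; returns the new NN's view. */
    static int incrementalAfterRestart() {
        SketchNameNode nn = new SketchNameNode();
        nn.incrementFailedVolumes("dn1"); // DN loses a volume and reports it
        nn = new SketchNameNode();        // NN restarts; in-memory state is gone
        // DN re-registers, but it has no new failure to report as an increment.
        return nn.getFailedVolumes("dn1");
    }

    /** Replays the same steps with the DN sending its total on every heartbeat. */
    static int absoluteAfterRestart() {
        int dnFailedVolumes = 1;              // DN tracks its own total locally
        SketchNameNode nn = new SketchNameNode();
        nn.heartbeat("dn1", dnFailedVolumes); // DN loses a volume and reports it
        nn = new SketchNameNode();            // NN restarts
        nn.heartbeat("dn1", dnFailedVolumes); // first heartbeat after re-registering
        return nn.getFailedVolumes("dn1");
    }

    public static void main(String[] args) {
        System.out.println("incremental: " + incrementalAfterRestart());
        System.out.println("absolute: " + absoluteAfterRestart());
    }
}
```

In this sketch the restarted NN sees zero failed volumes under incremental reporting, while under absolute reporting the first heartbeat after the restart restores the correct count.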

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
