hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2701) Cleanup FS* processIOError methods
Date Sun, 18 Dec 2011 01:37:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171747#comment-13171747 ]

Todd Lipcon commented on HDFS-2701:

- The behavior of exitIfInvalidStreams is extremely counter-intuitive. Why not change
it to check for an empty list, and change the call site to call it _after_ the errored dir
is removed?
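
To illustrate the suggestion, here is a minimal sketch of "remove first, then check for an empty list". The class and method names (ExitCheckSketch, noStreamsLeft, removeThenCheck) are hypothetical, and plain strings stand in for the edit streams; this is not the code from the patch.

```java
import java.util.List;

class ExitCheckSketch {
  /** Hypothetical simplified check: true when no edit stream is left to write to. */
  static boolean noStreamsLeft(List<String> editStreams) {
    return editStreams.isEmpty();
  }

  /** Drop the failed stream first, then run the check on what remains. */
  static boolean removeThenCheck(List<String> editStreams, String failedStream) {
    editStreams.remove(failedStream);   // remove(Object): drops the failed entry
    return noStreamsLeft(editStreams);  // the caller would exit if this is true
  }
}
```

With this ordering the check only has to answer one question: is anything left?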


In removeEditsAndStorageDir:
+    editStreams.remove(idx);
+    fsimage.removeStorageDir(getStorageDirForStream(idx));
I don't think this is correct -- because getStorageDirForStream is called after the stream is
removed from editStreams, it will remove the storage dir of the stream that came _after_ it in
the list (or throw an ArrayIndexOutOfBounds if it was the last stream).
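
The index shift can be reproduced in isolation. The sketch below is an assumption-laden stand-in, not the real FSEditLog code: streams are modelled as path strings, and dirOf is a hypothetical substitute for whatever getStorageDirForStream does to map a stream to its directory.

```java
import java.util.List;

class RemoveOrderSketch {
  /** Hypothetical mapping from a stream (here, a file path) to its directory. */
  static String dirOf(String streamPath) {
    return streamPath.substring(0, streamPath.lastIndexOf('/'));
  }

  /** Order used in the patch: remove first, then look up by the same index. */
  static String lookupAfterRemove(List<String> streams, int idx) {
    streams.remove(idx);               // everything after idx shifts down by one
    return dirOf(streams.get(idx));    // now refers to the NEXT stream, or is out of range
  }

  /** Safer order: resolve the directory while idx still points at the failed stream. */
  static String lookupBeforeRemove(List<String> streams, int idx) {
    String dir = dirOf(streams.get(idx));
    streams.remove(idx);
    return dir;
  }
}
```

With streams for /data1, /data2 and /data3, failing index 1 makes lookupAfterRemove return /data3 while lookupBeforeRemove returns /data2. With java.util.ArrayList, failing the last index makes the first variant throw IndexOutOfBoundsException.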

In {{removeEditsStreamsAndStorageDirs}}, you can use a foreach loop instead of indexed iteration:
+    for (int i = 0; i < errorStreams.size(); i++) {
+      int idx = editStreams.indexOf(errorStreams.get(i));
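
A minimal sketch of the foreach form, assuming errorStreams and editStreams are java.util.List instances. Strings stand in for the stream objects, and the class and method names are hypothetical; the real loop body would do the removal rather than collect indices.

```java
import java.util.ArrayList;
import java.util.List;

class ForEachSketch {
  /** Looks up each errored stream's position without a manual loop counter. */
  static List<Integer> indicesOf(List<String> editStreams, List<String> errorStreams) {
    List<Integer> indices = new ArrayList<>();
    for (String errStream : errorStreams) {
      int idx = editStreams.indexOf(errStream);  // -1 if the stream is not in the list
      indices.add(idx);
    }
    return indices;
  }
}
```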

+        FSNamesystem.LOG.error("Unable to sync edit log");
We should probably include the path of the failed stream in this message.

+          throw new IOException(
+              "Inconsistent existance of edits.new " + editsNew);
spelling error - should be "existence"


What's the test plan for this, HDFS-2702, and HDFS-2703? I agree it's buggy, but we should articulate
a way to make sure we fixed all the issues and didn't introduce new ones.

> Cleanup FS* processIOError methods
> ----------------------------------
>                 Key: HDFS-2701
>                 URL: https://issues.apache.org/jira/browse/HDFS-2701
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 1.0.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: hdfs-2701.txt, hdfs-2701.txt, hdfs-2701.txt
> Let's rename the various "processIOError" methods to be more descriptive. The current
> code makes it difficult to identify and reason about bug fixes. While we're at it let's remove
> "Fatal" from the "Unable to sync the edit log" log since it's not actually a fatal error (this
> is confusing to users). And 2NN "Checkpoint done" should be info, not a warning (also confusing
> to users).
> Thanks to HDFS-1073 these issues don't exist on trunk or 23.


