hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3934) duplicative dfs_hosts entries handled wrong
Date Thu, 23 May 2013 21:03:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665668#comment-13665668 ]

Aaron T. Myers commented on HDFS-3934:

The latest patch looks pretty good to me, Colin. A few comments:

# Recommend you add @VisibleForTesting annotation to DataNode#getXferPort.
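For illustration, a minimal sketch of what that would look like. The real patch would import Guava's com.google.common.annotations.VisibleForTesting; a local stand-in annotation is declared here only so the sketch compiles on its own:

```java
// A local stand-in for Guava's @VisibleForTesting, declared so this
// sketch is self-contained; Hadoop itself would import the Guava one.
@interface VisibleForTesting {}

public class DataNode {
  private final int xferPort;

  DataNode(int xferPort) {
    this.xferPort = xferPort;
  }

  // The annotation documents that this accessor exists for tests
  // rather than for production callers; it has no runtime effect.
  @VisibleForTesting
  int getXferPort() {
    return xferPort;
  }
}
```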
# Recommend you change this exception message to include the full entry string, not just the
port part that couldn't be parsed, and mention explicitly that it was the port component that
couldn't be parsed:
+          throw new IOException("invalid number format when parsing " +
+              portStr, e);
The fact that it was an invalid number and what the number was will already be contained in
the message for the NumberFormatException.
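As a hedged sketch, the wrapped exception might read like this (the entry string and helper name are illustrative, not the actual patch):

```java
import java.io.IOException;

public class PortParseExample {
  // Parse the port component of a dfs_hosts entry such as "host:50010".
  // On failure, name the full entry and say which component failed; the
  // chained NumberFormatException already carries the unparseable text.
  static int parsePort(String entry, String portStr) throws IOException {
    try {
      return Integer.parseInt(portStr);
    } catch (NumberFormatException e) {
      throw new IOException("failed to parse the port component of the " +
          "dfs_hosts entry \"" + entry + "\"", e);
    }
  }
}
```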
# Recommend you add something to this warning message to make it clear that this is expected
if using the DN registration name feature, and to make it clear that this was encountered
when reading an include or exclude file:
+        LOG.warn("unknown host " + prefix, e);
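One possible wording, sketched with a java.util.logging stand-in (the exact message text and surrounding names are assumptions, not the committed change):

```java
import java.util.logging.Logger;

public class IncludeFileWarning {
  private static final Logger LOG =
      Logger.getLogger(IncludeFileWarning.class.getName());

  // Build and log the expanded warning: say where the failed lookup
  // happened and why an unknown host can be normal.
  static String warnUnknownHost(String prefix) {
    String msg = "unknown host " + prefix + " when reading an include or "
        + "exclude file (expected if the DN registration name feature is in use)";
    LOG.warning(msg);
    return msg;
  }
}
```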
# In HostFileManager.EntrySet#find(Entry), since right after this we unconditionally return
null, you can condense this code into a single {{if (...)}} condition which first checks that
"{{ceil != null}}":
+        if (ceil == null) {
+          return null;
+        }
+        if (ceil.getValue().getIdentifier().equals(
+              toFind.getIdentifier())) {
+          return ceil.getValue();
+        }
# In HostFileManager.MutableEntrySet#add(DatanodeID), are we guaranteed that datanodeID.getXferPort()
>= 0? Perhaps we should assert that?
# Perhaps make HostFileManager.EntrySet.index protected?
# I see the purpose of delaying the throwing of the errors in HostFileManager#refresh, but
you might want to add a comment explaining it, since it's not super obvious. I'd also recommend
adding something to the log messages in that method along the lines of "failed to read exclude
file, continuing to use previous list of excluded nodes" to make it clear what happens in
this case.
# Perhaps I'm missing something, but why have separate classes for EntrySet and MutableEntrySet?
The only time we use just the normal EntrySet is for the initial empty sets, so seems like
we should just have a single class.
# Seems like you should do an s/DataNode/NameNode/g in this comment:
+    // These entries will be de-duped by the DataNode, since they refer
+    // to the same IP address + port combo.
# I don't think this code is doing anything in TestDecommission#testDuplicateHostEntries:
+    info = client.datanodeReport(DatanodeReportType.DEAD);
# Seems like you could replace all of this code with just three asserts: two calls to Map#contains(...),
and one check that Map#size() == 2:
+    Iterator<Map.Entry<String, DatanodeInfo>> iter =
+            deadByXferAddr.entrySet().iterator();
+    boolean foundPort1 = false, foundPort2 = false;
+    while (iter.hasNext()) {
+      Map.Entry<String, DatanodeInfo> entry = iter.next();
+      DatanodeInfo dn = entry.getValue();
+      if (dn.getXferPort() == port1) {
+        foundPort1 = true;
+        iter.remove();
+      } else if (dn.getXferPort() == port2) {
+        foundPort2 = true;
+        iter.remove();
+      }
+    }
+    Assert.assertTrue("did not find a dead entry with port " + port1,
+        foundPort1);
+    Assert.assertTrue("did not find a dead entry with port " + port2,
+        foundPort2);
+    Assert.assertTrue(deadByXferAddr.isEmpty());
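A condensed version along those lines might look like this. The key format and names are assumptions, and Java's Map exposes containsKey rather than contains:

```java
import java.util.HashMap;
import java.util.Map;

public class DeadNodeAssertExample {
  // Replace the iterate-and-remove loop with two membership checks plus
  // a size check, which together prove "exactly these two entries".
  static void assertDeadEntries(Map<String, ?> deadByXferAddr,
                                String addr1, String addr2) {
    if (!deadByXferAddr.containsKey(addr1)) {
      throw new AssertionError("did not find a dead entry at " + addr1);
    }
    if (!deadByXferAddr.containsKey(addr2)) {
      throw new AssertionError("did not find a dead entry at " + addr2);
    }
    if (deadByXferAddr.size() != 2) {
      throw new AssertionError("unexpected extra dead entries");
    }
  }

  public static void main(String[] args) {
    Map<String, String> dead = new HashMap<>();
    dead.put("127.0.0.1:50010", "dn1");
    dead.put("127.0.0.1:50011", "dn2");
    assertDeadEntries(dead, "127.0.0.1:50010", "127.0.0.1:50011");
    System.out.println("ok");
  }
}
```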
# I like that you make a copy of the Configuration object in testIncludeByRegistrationName,
and recommend you do the same in testDuplicateHostEntries, just to minimize the likelihood
of inter-test interference.

I'll be +1 once these are addressed.

Daryn (or anyone who's intending to review this) - please do so shortly. I'll be committing
this soon after Colin posts a patch addressing these comments unless I hear from someone else.
> duplicative dfs_hosts entries handled wrong
> -------------------------------------------
>                 Key: HDFS-3934
>                 URL: https://issues.apache.org/jira/browse/HDFS-3934
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.1-alpha
>            Reporter: Andy Isaacson
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-3934.001.patch, HDFS-3934.002.patch, HDFS-3934.003.patch, HDFS-3934.004.patch,
> HDFS-3934.005.patch, HDFS-3934.006.patch, HDFS-3934.007.patch, HDFS-3934.008.patch, HDFS-3934.010.patch,
> HDFS-3934.011.patch, HDFS-3934.012.patch, HDFS-3934.013.patch, HDFS-3934.014.patch
> A dead DN listed in dfs_hosts_allow.txt by IP and in dfs_hosts_exclude.txt by hostname
> ends up being displayed twice in {{dfsnodelist.jsp?whatNodes=DEAD}} after the NN restarts
> because {{getDatanodeListForReport}} does not handle such a "pseudo-duplicate" correctly:
> # the "Remove any nodes we know about from the map" loop no longer has the knowledge
> to remove the spurious entries
> # the "The remaining nodes are ones that are referenced by the hosts files" loop does
> not do hostname lookups, so does not know that the IP and hostname refer to the same host.
> Relatedly, such an IP-based dfs_hosts entry results in a cosmetic problem in the JSP
> output: The *Node* column shows ":50010" as the nodename, with HTML markup {{<a href="http://:50075/browseDirectory.jsp?namenodeInfoPort=50070&amp;dir=%2F&amp;nnaddr="

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
