hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4417) HDFS-347: fix case where local reads get disabled incorrectly
Date Tue, 22 Jan 2013 03:10:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559333#comment-13559333 ]

Todd Lipcon commented on HDFS-4417:

+  @VisibleForTesting
+  public void killDataXceiverServer() {
+    if (dataXceiverServer != null) {
+      ((DataXceiverServer) this.dataXceiverServer.getRunnable()).kill();
+      this.dataXceiverServer.interrupt();
+      dataXceiverServer = null;
+    }
+  }

I think you forgot to delete this method, which is left over from an approach you didn't end
up using. Also, the removal of the assert in {{kill}} shouldn't be in the patch anymore.


+      return Mockito.mock(DomainSocket.class, 
+          new Answer<Object>() {
+            @Override
+            public Object answer(InvocationOnMock invocation) throws Throwable {
+              throw new RuntimeException("...");
+          } });

Can you add a one-line comment explaining this, like 'Return a mock which always throws exceptions
on any of its function calls'? Also, fill in the exception text with something like "Injected
fault" instead of "...".
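
To illustrate the "always throws" idea on its own, here is a minimal sketch that uses the JDK's java.lang.reflect.Proxy in place of Mockito so it runs without extra dependencies. It is not the code from the patch: the {{FaultInjection}} class and the {{SocketLike}} interface are hypothetical names made up for this example, and a proxy only works for interfaces, whereas the patch mocks the {{DomainSocket}} class through Mockito.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class FaultInjection {
  /** Hypothetical stand-in for a socket-like type; not a Hadoop interface. */
  public interface SocketLike {
    int read(byte[] buf);
    void close();
  }

  /** Return a proxy which always throws exceptions on any of its method calls. */
  @SuppressWarnings("unchecked")
  public static <T> T alwaysThrowing(Class<T> iface) {
    // The handler is invoked for every method on the proxy, so every call fails.
    InvocationHandler handler = (proxy, method, args) -> {
      throw new RuntimeException("Injected fault in " + method.getName());
    };
    return (T) Proxy.newProxyInstance(
        iface.getClassLoader(), new Class<?>[] { iface }, handler);
  }

  public static void main(String[] args) {
    SocketLike sock = alwaysThrowing(SocketLike.class);
    try {
      sock.read(new byte[16]);
    } catch (RuntimeException e) {
      System.out.println(e.getMessage()); // prints "Injected fault in read"
    }
  }
}
```

A descriptive message such as "Injected fault" makes it obvious in a test log that the failure was deliberate, which is the point of replacing the "..." placeholder.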


It looks like your patch might be missing the new test case. I don't see anything that sets the
{{tcpReadsDisabledForTesting}} flag, nor the {{TestParallelShortCircuitReadUnCached}} class you mentioned.
> HDFS-347: fix case where local reads get disabled incorrectly
> -------------------------------------------------------------
>                 Key: HDFS-4417
>                 URL: https://issues.apache.org/jira/browse/HDFS-4417
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, hdfs-client, performance
>            Reporter: Todd Lipcon
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-4417.002.patch, HDFS-4417.003.patch, hdfs-4417.txt
> In testing HDFS-347 against HBase (thanks [~jdcryans]) we ran into the following case:
> - a workload is running which puts a bunch of local sockets in the PeerCache
> - the workload abates for a while, causing the sockets to go "stale" (i.e. the DN side
> disconnects after the keepalive timeout)
> - the workload starts again
> In this case, the local socket retrieved from the cache failed the newBlockReader call,
> and it incorrectly disabled local sockets on that host. This is similar to an earlier bug,
> HDFS-3376, but not quite the same.
> The next issue we ran into is that, once this happened, it never tried local sockets
> again, because the cache held lots of TCP sockets. Since we always managed to get a cached
> socket to the local node, it didn't bother trying local read again.
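
The distinction in the first scenario above (a failure on a cached socket versus a failure on a freshly opened one) can be sketched as follows. This is only an illustration under stated assumptions, not the fix in the attached patches: {{StaleSocketSketch}}, {{Conn}}, and {{ConnFactory}} are hypothetical names and do not correspond to the Hadoop PeerCache or DomainSocket APIs. The sketch treats a failure on a cached connection as "probably stale" and drops it, and disables local reads only when a brand-new connection also fails.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

public class StaleSocketSketch {
  /** Hypothetical connection abstraction; not a Hadoop interface. */
  public interface Conn {
    String open(String block) throws IOException;
  }

  /** Hypothetical factory that opens a fresh local connection. */
  public interface ConnFactory {
    Conn connect() throws IOException;
  }

  private final Deque<Conn> cached = new ArrayDeque<>();
  private final ConnFactory factory;
  private boolean localReadsDisabled = false;

  public StaleSocketSketch(ConnFactory factory) {
    this.factory = factory;
  }

  public void addToCache(Conn c) {
    cached.push(c);
  }

  public boolean isLocalReadsDisabled() {
    return localReadsDisabled;
  }

  public String read(String block) throws IOException {
    // Try cached connections first. A failure here may just mean the other
    // side closed the connection after a keepalive timeout, so it must not
    // disable local reads; the connection is simply discarded.
    while (!cached.isEmpty()) {
      Conn c = cached.pop();
      try {
        return c.open(block);
      } catch (IOException stale) {
        // Drop the stale connection and keep going.
      }
    }
    if (localReadsDisabled) {
      throw new IOException("local reads disabled");
    }
    // Only a failure on a freshly opened connection counts as evidence
    // that local reads do not work on this host.
    try {
      return factory.connect().open(block);
    } catch (IOException e) {
      localReadsDisabled = true;
      throw e;
    }
  }
}
```

The sketch does not address the second problem described above, where a cache full of TCP sockets keeps the client from ever retrying a local read.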
