hadoop-hdfs-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1787) "Not enough xcievers" error should propagate to client
Date Sat, 11 Jun 2011 18:26:58 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047962#comment-13047962 ]

Jonathan Hsieh commented on HDFS-1787:
--------------------------------------

{quote}
> Text.readString can throw IOException. The InternalDataNodeException thrown on the next
line is also a subclass of IOException. Behaviorwise it would essentially use the same error
recovery path.

However, we will lose information such as socket addresses.
{quote}

I believe this is already an error path, but I'll look into this more.

{quote}
Some comments:

Please combine them into one message.
{code}
+          DFSClient.LOG.warn("Failed to connect to" + targetAddr +": "
+              + ex.getMessage());
+          DFSClient.LOG.warn(" Adding to deadNodes and continuing");
{code}
{quote}

My plan is to add \n's to the log message.
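
As a rough illustration of that plan (a minimal sketch, not the code in the patch), the two warnings could be built as one string with an embedded newline. The class and method names here are hypothetical, and the helper returns the string instead of calling DFSClient.LOG so that it stays self-contained:

```java
class ConnectFailureLogging {
    // Hypothetical helper: builds the single combined warning that would be
    // passed to one LOG.warn(...) call instead of two.
    static String connectFailureMessage(String targetAddr, Exception ex) {
        return "Failed to connect to " + targetAddr + ": " + ex.getMessage()
            + "\nAdding to deadNodes and continuing";
    }
}
```

Emitting one message keeps the two lines together in the log even when other threads are logging concurrently.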

{quote}
It is better to log the exception.
{code}
+      } catch (IOException e) {
+        // preserve previous semantics, eat the exception.
+      }
{code}
{quote}

Will add logging.
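
A minimal sketch of what "log instead of eat" could look like, assuming java.util.logging as a stand-in for the client's real logger. The class, interface, and method names are hypothetical and are not taken from the patch:

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

class QuietIo {
    static final Logger LOG = Logger.getLogger(QuietIo.class.getName());

    // Hypothetical functional interface for an action that may fail with IOException.
    interface IoAction {
        void run() throws IOException;
    }

    // Runs the action; on IOException, logs it (with stack trace) and continues,
    // which preserves the previous "do not rethrow" semantics.
    static boolean runAndLog(IoAction action) {
        try {
            action.run();
            return true;
        } catch (IOException e) {
            LOG.log(Level.WARNING, "Ignoring IOException to preserve previous semantics", e);
            return false;
        }
    }
}
```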

{quote}
Do we really need internalDNErrors and getInternalDNErrorCount()? It is only used in the tests.
{quote}

Can you suggest an alternate mechanism for (automated) testing of the changes other than visual
inspection of the logs?  

This tests that the error messaging path was exercised, and it provides some information
that may be useful in troubleshooting. I believe there are annotations in the works that
semantically mean "public for testing but otherwise private/package". I believe the comment
I added would make this reasonably easy to find when this gets integrated throughout.
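
For illustration only, a counter of this kind might look like the sketch below. The field and getter names come from the review comment above; the enclosing class, the record method, and the use of AtomicInteger are assumptions, not the patch's actual code:

```java
import java.util.concurrent.atomic.AtomicInteger;

class DnErrorTracker {
    // Counts error replies received from DataNodes.
    private final AtomicInteger internalDNErrors = new AtomicInteger();

    // Hypothetical hook called from the error-handling path.
    void recordInternalDNError() {
        internalDNErrors.incrementAndGet();
    }

    // Visible for testing: package-private so tests in the same package can
    // assert that the error path was exercised without parsing logs.
    int getInternalDNErrorCount() {
        return internalDNErrors.get();
    }
}
```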

> "Not enough xcievers" error should propagate to client
> ------------------------------------------------------
>
>                 Key: HDFS-1787
>                 URL: https://issues.apache.org/jira/browse/HDFS-1787
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>              Labels: newbie
>             Fix For: 0.23.0
>
>         Attachments: hdfs-1787.2.patch, hdfs-1787.3.patch, hdfs-1787.3.patch, hdfs-1787.5.patch,
hdfs-1787.patch
>
>
> We find that users often run into the default transceiver limits in the DN. Putting aside
the inherent issues with xceiver threads, it would be nice if the "xceiver limit exceeded"
error propagated to the client. Currently, clients simply see an EOFException which is hard
to interpret, and have to go slogging through DN logs to find the underlying issue.
> The data transfer protocol should be extended to either have a special error code for
"not enough xceivers" or should have some error code for generic errors with which a string
can be attached and propagated.

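The second option in the description, a generic error code with an attached string, can be sketched roughly as follows. This is not the HDFS data transfer protocol: the status values, class name, and wire layout (a short followed by a UTF string) are assumptions chosen only to show the idea of propagating the message to the client.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class StatusReply {
    // Hypothetical status codes, not the real protocol constants.
    static final short SUCCESS = 0;
    static final short ERROR = 1;

    // Server side: send a generic error status followed by a readable message.
    static void writeError(DataOutputStream out, String message) throws IOException {
        out.writeShort(ERROR);
        out.writeUTF(message);
    }

    // Server side: send a bare success status.
    static void writeSuccess(DataOutputStream out) throws IOException {
        out.writeShort(SUCCESS);
    }

    // Client side: surface the server's message instead of a bare EOFException.
    static void readReply(DataInputStream in) throws IOException {
        short status = in.readShort();
        if (status != SUCCESS) {
            throw new IOException("DataNode reported error: " + in.readUTF());
        }
    }
}
```

With a reply like this, the client-side exception carries the text the DataNode sent, so the cause is visible without reading DN logs.
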
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
