hadoop-hdfs-issues mailing list archives

From "Robert Joseph Evans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets
Date Mon, 07 May 2012 21:24:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270022#comment-13270022
] 

Robert Joseph Evans commented on HDFS-3376:
-------------------------------------------

Todd,

You are much more of an expert on this than I am.  I think HADOOP-8280 and HADOOP-8350 look
fine to pull in too.  Thanks for the help with this.

Aaron,

I spoke with Suresh off-line about it when I took over as release manager for branch-0.23,
as I was curious about it.  He thought that I could not.  I don't really see it being too
much of a problem just yet, because there have not been very many HDFS issues that are
applicable to branch-0.23.  That said, I am in the process of going through the full HDFS
list to see if I have missed anything.
                
> DFSClient fails to make connection to DN if there are many unusable cached sockets
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-3376
>                 URL: https://issues.apache.org/jira/browse/HDFS-3376
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 2.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: hdfs-3376.txt
>
>
> After fixing the datanode side of keepalive to properly disconnect stale clients, (HDFS-3357),
the client side has the following issue: when it connects to a DN, it first tries to use cached
sockets, and will try a configurable number of sockets from the cache. If there are more cached
sockets than the configured number of retries, and all of them have been closed by the datanode
side, then the client will throw an exception and mark the replica node as dead.
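A minimal sketch of the retry-from-cache behavior described above. All names here (connect, the boolean-valued cache standing in for cached sockets, maxRetries) are illustrative simplifications, not the real DFSClient or SocketCache API; it only models the failure mode: if the cache holds more stale sockets than the retry budget, the loop never falls through to opening a fresh connection.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CachedSocketRetry {
    // Each cached "socket" is a boolean: true = still usable,
    // false = already closed by the datanode (stale).
    static boolean connect(Deque<Boolean> cache, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            Boolean cached = cache.poll();
            if (cached == null) {
                // Cache exhausted: fall back to a fresh connection,
                // which we assume succeeds.
                return true;
            }
            if (cached) {
                return true; // usable cached socket
            }
            // Stale socket: this attempt is burned, loop and retry.
        }
        // Every retry was spent on a stale cached socket; the real
        // client throws here and marks the replica's node as dead.
        return false;
    }

    public static void main(String[] args) {
        int maxRetries = 3;

        // More stale cached sockets than retries: connection fails.
        Deque<Boolean> staleCache = new ArrayDeque<>();
        for (int i = 0; i < 5; i++) staleCache.add(false);
        System.out.println(connect(staleCache, maxRetries)); // false

        // Fewer stale sockets than retries: falls through to a fresh socket.
        Deque<Boolean> smallCache = new ArrayDeque<>();
        smallCache.add(false);
        System.out.println(connect(smallCache, maxRetries)); // true
    }
}
```

The sketch shows why the failure depends only on cache contents, not on the datanode actually being down: draining or skipping stale entries before counting retries avoids the false "dead node" verdict.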

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
