Date: Mon, 7 May 2012 17:50:49 +0000 (UTC)
From: "Tsz Wo (Nicholas), SZE (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <285254548.35466.1336413049357.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <382913428.30866.1336252310682.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets

[ 
https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269816#comment-13269816 ]

Tsz Wo (Nicholas), SZE commented on HDFS-3376:
----------------------------------------------

{code}
+      // Don't use the cache on the last attempt - it's possible that there
+      // are arbitrarily many unusable sockets in the cache, but we don't
+      // want to fail the read.
{code}
Just a question: Will the unusable sockets be closed and removed from the cache?

> DFSClient fails to make connection to DN if there are many unusable cached sockets
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-3376
>                 URL: https://issues.apache.org/jira/browse/HDFS-3376
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 2.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: hdfs-3376.txt
>
>
> After fixing the datanode side of keepalive to properly disconnect stale clients (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries to use cached sockets, and will try a configurable number of sockets from the cache. If there are more cached sockets than the configured number of retries, and all of them have been closed by the datanode side, then the client will throw an exception and mark the replica node as dead.
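
For readers following the thread, here is a minimal Java sketch of the retry behavior described in the issue and in the quoted patch comment: cached sockets are tried first, and the last attempt skips the cache so that a cache full of dead sockets cannot fail the read. This is not the DFSClient code. The names CachedSocketRetrySketch, Conn, FakeConn, connect, and opener are hypothetical, and sockets are modeled by a simple stand-in interface. The sketch also assumes that an unusable cached socket is closed and dropped once it has been polled; whether the actual patch does this is exactly the question asked above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical sketch only; not the actual DFSClient implementation. */
public class CachedSocketRetrySketch {

    /** Stand-in for a socket to a datanode (hypothetical interface). */
    public interface Conn {
        boolean isUsable(); // false models a socket the datanode already closed
        void close();
    }

    /** In-memory fake so the sketch runs without a network. */
    public static final class FakeConn implements Conn {
        private final boolean usable;
        private boolean closed = false;

        public FakeConn(boolean usable) { this.usable = usable; }

        @Override public boolean isUsable() { return usable && !closed; }
        @Override public void close() { closed = true; }
        public boolean isClosed() { return closed; }
    }

    /**
     * Tries up to maxAttempts times to obtain a usable connection.
     * Every attempt except the last polls the cache first; the last
     * attempt always opens a fresh connection through opener.
     */
    public static Conn connect(Deque<Conn> cache, int maxAttempts, Supplier<Conn> opener) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            boolean lastAttempt = (attempt == maxAttempts - 1);

            // Don't use the cache on the last attempt.
            Conn conn = lastAttempt ? null : cache.pollFirst();
            if (conn == null) {
                conn = opener.get(); // cache skipped or empty: open a new connection
            }
            if (conn.isUsable()) {
                return conn;
            }
            // Assumption of this sketch: an unusable socket is closed here and,
            // having been polled, is no longer in the cache.
            conn.close();
        }
        throw new IllegalStateException(
            "could not connect after " + maxAttempts + " attempts");
    }
}
```

With five stale sockets in the cache and three attempts, the first two attempts each consume and close one stale socket, and the third opens a fresh connection. Three stale sockets remain cached, which is why the question of eventually closing and evicting them still matters.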