Date: Mon, 7 May 2012 18:42:52 +0000 (UTC)
From: "Robert Joseph Evans (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <1151832403.35748.1336416172248.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <382913428.30866.1336252310682.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3376) DFSClient fails to make connection to DN if there are many unusable cached sockets

    [ https://issues.apache.org/jira/browse/HDFS-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269864#comment-13269864 ]

Robert Joseph Evans commented on HDFS-3376:
-------------------------------------------

Hey Todd,

I have been trying to follow some of the fixes you have been putting into the HDFS socket caching. Would you be willing to pull HDFS-3357 and this one, HDFS-3376, into branch-0.23? Both seem to apply cleanly, but I am not an HDFS committer, so I cannot do this myself.

> DFSClient fails to make connection to DN if there are many unusable cached sockets
> ----------------------------------------------------------------------------------
>
>                 Key: HDFS-3376
>                 URL: https://issues.apache.org/jira/browse/HDFS-3376
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 2.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: hdfs-3376.txt
>
>
> After fixing the datanode side of keepalive to properly disconnect stale clients (HDFS-3357), the client side has the following issue: when it connects to a DN, it first tries cached sockets, up to a configurable number of attempts. If the cache holds more sockets than the configured number of retries, and the datanode has already closed all of them, the client throws an exception and marks the replica node as dead.
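
The failure mode described in the issue can be illustrated with a toy model. This is only a sketch: the class, method, and field names below are hypothetical and do not come from the HDFS code base, the "sockets" are plain objects, and opening a fresh connection is assumed to always succeed. The fallback variant shows one possible remedy for illustration; it is not necessarily what the attached patch does.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the stale-cached-socket problem. All names are hypothetical. */
class StaleSocketCacheSketch {

    /** Stand-in for a connection to a datanode. */
    static final class FakeSocket {
        final boolean closedByPeer; // true if the datanode already dropped it
        final boolean fromCache;    // true if it came out of the client cache

        FakeSocket(boolean closedByPeer, boolean fromCache) {
            this.closedByPeer = closedByPeer;
            this.fromCache = fromCache;
        }
    }

    /** Assumption: a brand-new connection to the datanode always works. */
    static FakeSocket openFresh() {
        return new FakeSocket(false, false);
    }

    /** Builds a cache holding n sockets that the datanode has already closed. */
    static Deque<FakeSocket> staleCache(int n) {
        Deque<FakeSocket> cache = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            cache.add(new FakeSocket(true, true));
        }
        return cache;
    }

    /**
     * Behavior described in the issue: each attempt prefers a cached socket
     * and only opens a fresh one once the cache is empty. If the cache holds
     * more stale sockets than maxAttempts, every attempt is wasted.
     */
    static FakeSocket connectCachedFirst(Deque<FakeSocket> cache, int maxAttempts)
            throws IOException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            FakeSocket s = cache.poll(); // null when the cache is empty
            if (s == null) {
                s = openFresh();
            }
            if (!s.closedByPeer) {
                return s;
            }
        }
        throw new IOException("all " + maxAttempts + " attempts used stale cached sockets");
    }

    /** One possible remedy (an assumption): fall back to a fresh connection. */
    static FakeSocket connectWithFreshFallback(Deque<FakeSocket> cache, int maxAttempts) {
        try {
            return connectCachedFirst(cache, maxAttempts);
        } catch (IOException e) {
            return openFresh();
        }
    }

    public static void main(String[] args) {
        try {
            connectCachedFirst(staleCache(5), 3);
            System.out.println("connected");
        } catch (IOException e) {
            System.out.println("failed: " + e.getMessage());
        }
        FakeSocket s = connectWithFreshFallback(staleCache(5), 3);
        System.out.println("fallback socket came from cache: " + s.fromCache);
    }
}
```

With five stale sockets cached and three attempts allowed, the cached-first path gives up even though the datanode is reachable, which is why the replica would be marked dead. With only two stale sockets cached, the third attempt reaches a fresh connection and succeeds.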