Date: Thu, 2 May 2013 12:32:15 +0000 (UTC)
From: "Yuyang Lan (JIRA)"
To: hdfs-dev@hadoop.apache.org
Reply-To: hdfs-dev@hadoop.apache.org
Subject: [jira] [Created] (HDFS-4788) Fix bug in bestNode function which caused 'Could not reach the block' exception even when there're nodes available

Yuyang Lan created HDFS-4788:
--------------------------------

             Summary: Fix bug in bestNode function which caused 'Could not reach the block' exception even when there're nodes available
                 Key: HDFS-4788
                 URL: https://issues.apache.org/jira/browse/HDFS-4788
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Yuyang Lan

class: org.apache.hadoop.hdfs.server.common.JspHelper
function: bestNode(DatanodeInfo[] nodes, boolean doRandom, Configuration conf)

This function is supposed to return the first (or a random) healthy node that passes a socket check. Because of a bug, the 'chosenNode' variable is assigned only on the first attempt. If the first node picked is dead, the socket check fails against that same node 3 times in a row, and the function finally throws a "Could not reach the block containing the data" IOException, regardless of whether other nodes are alive.
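The failure mode described in this report can be illustrated with a minimal, self-contained sketch. This is not the actual JspHelper source: the class name BestNodeSketch, the two method names, the use of plain strings for nodes, and the `reachable` predicate standing in for the socket check are all assumptions made for illustration. The sketch also walks the nodes in order rather than randomly so that its behavior is reproducible.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical stand-in for the selection loop; node names are assumed distinct.
final class BestNodeSketch {

    // Buggy shape: a node is selected only while chosenNode is null, so after
    // the first pick every retry probes the same node again.
    static String bestNodeBuggy(List<String> nodes, Predicate<String> reachable)
            throws IOException {
        Set<String> deadNodes = new HashSet<>();
        String chosenNode = null;
        int index = -1;
        for (int failures = 0; failures < nodes.size(); failures++) {
            if (chosenNode == null) {
                do {
                    index = (index + 1) % nodes.size();
                    chosenNode = nodes.get(index);
                } while (deadNodes.contains(chosenNode));
            }
            if (reachable.test(chosenNode)) {
                return chosenNode;
            }
            deadNodes.add(chosenNode); // recorded, but never consulted again
        }
        throw new IOException("Could not reach the block containing the data");
    }

    // One possible fix: select a fresh, not-yet-failed node on every attempt.
    static String bestNodeFixed(List<String> nodes, Predicate<String> reachable)
            throws IOException {
        Set<String> deadNodes = new HashSet<>();
        int index = -1;
        for (int failures = 0; failures < nodes.size(); failures++) {
            String chosenNode;
            // deadNodes holds fewer entries than nodes here, so this terminates.
            do {
                index = (index + 1) % nodes.size();
                chosenNode = nodes.get(index);
            } while (deadNodes.contains(chosenNode));
            if (reachable.test(chosenNode)) {
                return chosenNode;
            }
            deadNodes.add(chosenNode);
        }
        throw new IOException("Could not reach the block containing the data");
    }
}
```

With three nodes of which only the first is unreachable, bestNodeBuggy probes the first node three times and throws, while bestNodeFixed moves on and returns the second node. Whether the real fix re-selects the node in this way or restructures the loop differently is up to the eventual patch.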